Pub Date: 2024-11-08 | DOI: 10.1016/j.bspc.2024.107137
Haozhen Xiang, Yuqi Xiong, Yingwei Shen, Jiaxin Li, Deshan Liu
Clinically, personalized treatment developed based on the immunohistochemical (IHC) molecular sub-types of breast cancer can enhance long-term survival rates. Nevertheless, IHC, as an invasive detection method, may pose some risk of tumor metastasis caused by puncture. This work proposes a collaborative multi-task model based on multi-modal data. First, a dual-stream learning network based on Swin Transformer is employed to extract features from both DCE and T1WI images. Specifically, a Shared Representation (SR) module extracts shared representations, while an Enhancement of Unique features (EU) module enhances modality-specific features. Subsequently, a multi-path classification network is constructed that jointly considers MRI image features, lesion location, and morphological features. Comprehensive experiments on clinical MRI images show the proposed method outperforms state-of-the-art approaches, with an accuracy of 85.1%, a sensitivity of 84.0%, a specificity of 95.1%, and an F1 score of 83.6%.
Title: A collaborative multi-task model for immunohistochemical molecular sub-types of multi-modal breast cancer MRI images (Biomedical Signal Processing and Control, vol. 100, Article 107137)
Deep learning (DL) methods have emerged as the state-of-the-art for Magnetic Resonance Imaging (MRI) reconstruction. DL methods typically involve training deep neural networks to take undersampled MRI images as input and transform them into high-quality MRI images through data-driven processes. However, deep learning models often fail at higher levels of undersampling due to insufficient information in the input, which is crucial for producing high-quality MRI images. Thus, optimizing the information content at the input of a DL reconstruction model could significantly improve reconstruction accuracy. In this paper, we introduce a self-supervised pretraining procedure using contrastive learning to improve the accuracy of undersampled DL MRI reconstruction. We use contrastive learning to transform the MRI image representations into a latent space that maximizes mutual information among different undersampled representations and optimizes the information content at the input of the downstream DL reconstruction models. Our experiments demonstrate improved reconstruction accuracy across a range of acceleration factors and datasets, both quantitatively and qualitatively. Furthermore, our extended experiments validate the proposed framework’s robustness under adversarial conditions, such as measurement noise, different k-space sampling patterns, and pathological abnormalities, and also demonstrate transfer learning capabilities on MRI datasets with completely different anatomy. Additionally, we conducted experiments to visualize and analyze the properties of the proposed MRI contrastive learning latent space. Code available here.
Title: CL-MRI: Self-Supervised contrastive learning to improve the accuracy of undersampled MRI reconstruction (Biomedical Signal Processing and Control, vol. 100, Article 107185, open access)
Pub Date: 2024-11-08 | DOI: 10.1016/j.bspc.2024.107171
Imran Ul Haq, Haider Ali, Yuefeng Li, Zhe Liu
Introduction
Ultrasonography is among the most widely used methods for early detection of breast cancer. Automatic and precise segmentation of breast masses in breast ultrasound (US) images is essential but still challenging due to several sources of uncertainty, such as the high variety of tumor shapes and sizes, obscure tumor borders, very low SNR, and speckle noise.
Method
To deal with these uncertainties, this work presents an effective and automated GAN-based approach for tumor segmentation in breast US, named MAR-GAN, that extracts rich, informative features from US images. In MAR-GAN, the capabilities of the traditional encoder-decoder generator are enhanced by multiple modifications. Multi-scale residual blocks are used to retrieve additional aspects of the tumor area for a more precise description. A novel boundary and foreground attention (BFA) module is proposed to increase attention on the tumor region and boundary curve. The squeeze and excitation (SE) and adaptive context selection (ACS) modules are added to increase representational capability on the encoder side and to facilitate better selection and aggregation of contextual information on the decoder side, respectively. The L1-norm and structural similarity index metric (SSIM) are added to MAR-GAN’s loss function to capture rich local context information from the tumors’ surroundings.
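The two auxiliary loss terms can be sketched as follows (a global, single-scale SSIM is used here for brevity; a practical implementation would use a windowed SSIM, and the weighting is illustrative):

```python
import numpy as np

def ssim_global(x, y, c1=1e-4, c2=9e-4):
    """Single-scale SSIM over whole images in [0, 1] (c1, c2 are the
    standard stabilizing constants for unit dynamic range)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx * mx + my * my + c1) * (vx + vy + c2))

def seg_aux_loss(pred, target, w_l1=1.0, w_ssim=1.0):
    """L1 term plus a structural-dissimilarity term (1 - SSIM)."""
    return w_l1 * np.abs(pred - target).mean() + w_ssim * (1.0 - ssim_global(pred, target))
```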
Results
Two breast US datasets were utilized to evaluate the effectiveness of the suggested approach. On the BUSI dataset, our network outperformed several state-of-the-art segmentation models in the IoU and Dice metrics, scoring 89.27 % and 94.21 %, respectively. The suggested approach also achieved encouraging results on the UDIAT dataset, with IoU and Dice scores of 82.75 % and 88.54 %, respectively.
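For reference, the IoU and Dice metrics used above reduce to simple set overlaps on binary masks (sketch):

```python
import numpy as np

def iou_dice(pred, gt):
    """IoU and Dice coefficients between two binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    iou = inter / union                       # |A ∩ B| / |A ∪ B|
    dice = 2 * inter / (pred.sum() + gt.sum())  # 2|A ∩ B| / (|A| + |B|)
    return iou, dice
```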
Title: MAR-GAN: Multi attention residual generative adversarial network for tumor segmentation in breast ultrasounds (Biomedical Signal Processing and Control, vol. 100, Article 107171)
Pub Date: 2024-11-06 | DOI: 10.1016/j.bspc.2024.107178
Prithwijit Mukherjee, Anisha Halder Roy
In the modern era, a significant percentage of people around the world suffer from knee pain-related problems. Knee pain can be alleviated by performing knee rehabilitation exercises in the correct posture on a regular basis. In our research, an attention mechanism-based CNN-TLSTM (Convolutional Neural Network-tanh Long Short-Term Memory) network has been proposed for assessing the knee pain level of a person. Here, electroencephalogram (EEG) signals of the frontal, parietal, and temporal lobes, electromyography (EMG) signals of the hamstring and quadriceps muscles, and the knee bending angle have been used for knee pain detection. First, the CNN has been utilized for automated feature extraction from the EEG, knee bending angle, and EMG data, and subsequently, the TLSTM network has been used as a classifier. The trained CNN-TLSTM model can classify the knee pain level of a person into five categories, namely no pain, low pain, medium pain, moderate pain, and high pain, with an overall accuracy of 95.88 %. On the hardware side, a prototype of an automated robotic knee rehabilitation system has been designed to help a person perform three rehabilitation exercises, i.e., sitting knee bending, straight leg raise, and active knee bending, according to his/her pain level, without the presence of any physiotherapist. The novelty of our research lies in (i) designing a novel deep learning-based classifier model for broadly classifying knee pain into five categories, (ii) introducing an attention mechanism into the TLSTM network to boost its classification performance, and (iii) developing a user-friendly rehabilitation device for knee rehabilitation.
Title: A deep learning-based comprehensive robotic system for lower limb rehabilitation (Biomedical Signal Processing and Control, vol. 100, Article 107178)
Pub Date: 2024-11-06 | DOI: 10.1016/j.bspc.2024.107160
Xinghang Wang, Haibo Tao, Bin Wang, Huaiping Jin, Zhenhui Li
Accurate detection of histopathological cancer subtypes is crucial for personalized treatment. Currently, deep learning methods based on histopathology images have become an effective solution to this problem. However, existing deep learning methods for histopathology image classification often suffer from high computational complexity, fail to account for the variability of different regions, and fail to attend to local and global information jointly. To address these issues, we propose a coarse-to-fine inference based vision transformer (ViT) network (CFI-ViT) for pathological image detection of gastric cancer subtypes. CFI-ViT combines global attention with discriminative and differentiable modules to achieve two-stage inference. In the coarse inference stage, a ViT model with relative position embedding is employed to extract global information from the input images. If the critical information is not sufficiently identified, the differentiable module is adopted to extract discriminative local image regions for fine-grained screening in the fine inference stage. The effectiveness and superiority of the proposed CFI-ViT method have been validated on three pathological image datasets of gastric cancer, including one private dataset clinically collected from Yunnan Cancer Hospital in China and two publicly available datasets, i.e., HE-GHI-DS and TCGA-STAD. The experimental results demonstrate that CFI-ViT achieves superior recognition accuracy and generalization performance compared to traditional methods, while using only 80 % of the computational resources required by the ViT model.
Title: CFI-ViT: A coarse-to-fine inference based vision transformer for gastric cancer subtype detection using pathological images (Biomedical Signal Processing and Control, vol. 100, Article 107160)
Pub Date: 2024-11-06 | DOI: 10.1016/j.bspc.2024.107186
Chongbo Yin, Jian Qin, Yan Shi, Yineng Zheng, Xingming Guo
Heart sound auscultation coupled with machine learning algorithms is a risk-free and low-cost method for coronary artery disease (CAD) detection. However, current studies mainly focus on CAD screening, namely classifying CAD and non-CAD, due to limited clinical data and algorithm performance. This leaves a gap in investigating CAD severity from the phonocardiogram (PCG). To close this gap, we first establish a clinical PCG dataset for CAD patients. The dataset includes 150 subjects: 80 with severe CAD and 70 with non-severe CAD. Then, we propose the large kernel convolution interaction network (LKCIN) to detect CAD severity. It integrates automatic feature extraction and pattern classification and simplifies PCG processing steps. The developed large kernel interaction block (LKIB) has three properties: long-distance dependency, a local receptive field, and channel interaction, which efficiently improve feature extraction in LKCIN. In addition, a separate downsampling block, placed after the LKIBs, is proposed to alleviate feature loss during forward propagation. Experiments were performed on the clinical PCG data, and LKCIN obtained good classification performance, with an accuracy of 85.97 %, a sensitivity of 85.64 %, and a specificity of 86.26 %. Our study goes beyond conventional CAD screening and provides a reliable option for CAD severity detection in clinical practice.
Title: Detection of severe coronary artery disease based on clinical phonocardiogram and large kernel convolution interaction network (Biomedical Signal Processing and Control, vol. 100, Article 107186)
Pub Date: 2024-11-06 | DOI: 10.1016/j.bspc.2024.107166
Xue Yuan, Maozhou Chen, Peng Ding, Anan Gan, Keren Shi, Anming Gong, Lei Zhao, Tianwen Li, Yunfa Fu, Yuqi Cheng
Objectives
Establishing objective and quantitative imaging markers at the individual level can assist in the accurate diagnosis of Major Depressive Disorder (MDD). However, the clinical heterogeneity of MDD leads to a decrease in recognition accuracy. To address this issue, we propose the Windowed Attention Aggregation Network (WAAN) for a medium-sized functional Magnetic Resonance Imaging (fMRI) dataset comprising 111 MDD patients and 106 Healthy Controls (HC).
Methods
The proposed WAAN model is a dynamic temporal model that contains two important components, Inner-Window Self-Attention (IWSA) and Cross-Window Self-Attention (CWSA), to characterize the MDD-fMRI data at a fine-grained level and fuse global temporal information. In addition, to optimize WAAN, a new Point to Domain Loss (p2d Loss) function is proposed, which guides the model, via intermediate supervision, to learn class centers with smaller class deviations, thus improving intra-class feature density.
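One plausible reading of a loss that tightens intra-class feature density is a center-loss-style pull toward learned class centers; the sketch below shows that idea only (the paper's exact p2d formulation is not given here, and all names are illustrative):

```python
import numpy as np

def center_pull_loss(feats, labels, centers):
    """Mean squared distance from each feature vector to its class center.
    feats: (N, D); labels: (N,) integer class ids; centers: (K, D).
    Minimizing this pulls features toward their centers, shrinking
    intra-class spread."""
    return np.mean(np.sum((feats - centers[labels]) ** 2, axis=1))
```

In training, such a term is typically added to the classification loss with a small weight, and the centers are updated jointly with the network.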
Results
The proposed WAAN achieved an accuracy of 83.8 % (±1.4 %) on the MDD identification task at a medium-sized site. The right superior orbitofrontal gyrus and the right superior temporal gyrus (pole) were found to be brain regions with high classification attribution in MDD patients, and the hippocampus showed stable attributions. The effect of temporal parameters on classification was also explored, and time-window parameters yielding high categorical attributions were obtained.
Significance
The proposed WAAN is expected to improve the accuracy of personalized identification of MDD. This study helps to find the target brain regions for treatment or intervention of MDD, and provides better scanning time window parameters for MDD-fMRI analysis.
Title: Intermediary-guided windowed attention Aggregation network for fine-grained characterization of Major Depressive Disorder fMRI (Biomedical Signal Processing and Control, vol. 100, Article 107166)
Pub Date: 2024-11-06 | DOI: 10.1016/j.bspc.2024.107112
Pengfei Hou, Xiaowei Li, Jing Zhu, Bin Hu (Fellow, IEEE)
Depression is a serious mental health condition affecting hundreds of millions of people worldwide. The electroencephalogram (EEG) is a spontaneous and rhythmic physiological signal capable of measuring the brain activity of subjects, serving as an objective biomarker for depression research. This paper proposes a lightweight Convolutional Transformer neural network (LCTNN) for depression identification. LCTNN features three significant characteristics: (1) It combines the advantages of both CNN and Transformer to learn rich EEG signal representations from local to global perspectives in the time domain. (2) A Channel Modulator (CM) dynamically adjusts the contribution of each electrode channel of the EEG signal to depression identification. (3) Because the high temporal resolution of EEG signals imposes a significant burden on computing self-attention, LCTNN replaces canonical self-attention with sparse attention, reducing its spatiotemporal complexity to O(L log L). Furthermore, this paper incorporates an attention pooling operation between two Transformer layers, further reducing the spatial complexity. Compared to other deep learning methods, LCTNN achieved state-of-the-art performance on the majority of metrics across two datasets. This indicates that LCTNN offers new insights into the relationship between EEG signals and depression, providing a valuable reference for the future development of depression diagnosis and treatment.
Title: A lightweight convolutional transformer neural network for EEG-based depression recognition (Biomedical Signal Processing and Control, vol. 100, Article 107112)
Pub Date : 2024-11-06DOI: 10.1016/j.bspc.2024.107139
P. Lavanya , K. Vidhya
Cancer is regarded as one of the most life-threatening diseases, causing a significant number of fatalities every year. Among the different cancer types, lung cancer is considered the most destructive, with the largest mortality rate. Therefore, an effective and accurate technique for detecting lung cancer is crucial for providing adequate treatment on time. This study presents a novel deep learning-based lung cancer detection method. The image processing technique comprises four major phases. Initially, the input images are pre-processed with an Adaptive Wiener filter, which eliminates noise in the image without any edge loss. Then, segmentation is performed using a Cascaded K-means Fuzzy C-means (KM-FCM) algorithm. Feature extraction and selection are carried out using a Radiomics approach, which aids in extracting and selecting meaningful features that facilitate cancer detection. The final stage of image processing is classification, which is accomplished by a novel Locust assisted Crow Search (CS) based Convolutional Neural Network (CNN) classifier. The proposed digital image processing technique displays impressive performance in detecting lung cancer, with an accuracy of 96.33%.
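The cascaded KM-FCM stage can be illustrated with a toy intensity-based version: plain k-means produces initial centroids, which then seed a fuzzy C-means refinement. The `cascaded_km_fcm` helper, the quantile initialisation, and all parameter values are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def cascaded_km_fcm(pixels, c=3, m=2.0, iters=30):
    """Toy cascade: k-means first, then fuzzy C-means (FCM) seeded with
    the k-means centroids. Operates on flattened pixel intensities."""
    x = pixels.reshape(-1, 1).astype(float)
    # stage 1: k-means, initialised at evenly spaced intensity quantiles
    centers = np.quantile(x, np.linspace(0, 1, c))[:, None]
    for _ in range(iters):
        labels = np.argmin(np.abs(x - centers.T), axis=1)
        centers = np.array([x[labels == j].mean() if np.any(labels == j)
                            else centers[j, 0] for j in range(c)])[:, None]
    # stage 2: FCM refinement of the k-means centroids
    for _ in range(iters):
        d = np.abs(x - centers.T) + 1e-12           # (N, c) distances
        u = 1.0 / d ** (2.0 / (m - 1.0))            # inverse-distance memberships
        u /= u.sum(axis=1, keepdims=True)
        um = u ** m
        centers = (um.T @ x) / um.sum(axis=0)[:, None]
    return u.argmax(axis=1), centers.ravel()

# synthetic three-intensity "image"
img = np.concatenate([np.full(100, 0.1), np.full(100, 0.5), np.full(100, 0.9)])
labels, centers = cascaded_km_fcm(img, c=3)
print(np.sort(centers))   # ≈ [0.1, 0.5, 0.9]
```

The cascade's appeal is that k-means supplies stable, cheap initial centroids, so the FCM stage starts near a good solution instead of a random one.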
{"title":"A novel lung cancer detection adopting Radiomic feature extraction with Locust assisted CS based CNN classifier","authors":"P. Lavanya , K. Vidhya","doi":"10.1016/j.bspc.2024.107139","DOIUrl":"10.1016/j.bspc.2024.107139","url":null,"abstract":"<div><div>Cancer is regarded as one of the most life-threatening diseases, causing a significant number of fatalities every year. Among the different cancer types, lung cancer is considered the most destructive, with the largest mortality rate. Therefore, an effective and accurate technique for detecting lung cancer is crucial for providing adequate treatment on time. This study presents a novel deep learning-based lung cancer detection method. The image processing technique comprises four major phases. Initially, the input images are pre-processed with an Adaptive Wiener filter, which eliminates noise in the image without any edge loss. Then, segmentation is performed using a Cascaded K-means Fuzzy C-means (KM-FCM) algorithm. Feature extraction and selection are carried out using a Radiomics approach, which aids in extracting and selecting meaningful features that facilitate cancer detection. The final stage of image processing is classification, which is accomplished by a novel Locust assisted Crow Search (CS) based Convolutional Neural Network (CNN) classifier. The proposed digital image processing technique displays impressive performance in detecting lung cancer, with an accuracy of 96.33%.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"100 ","pages":"Article 107139"},"PeriodicalIF":4.9,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142657556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, the preliminary diagnosis of Attention Deficit Hyperactivity Disorder (ADHD) using electroencephalography (EEG) has attracted attention from researchers. EEG, known for its expediency and efficiency, plays a pivotal role in the diagnosis and treatment of ADHD. However, the non-stationarity of EEG signals and inter-subject variability pose challenges to the diagnostic and classification processes. Topological Data Analysis (TDA) offers a novel perspective for ADHD classification, diverging from traditional time–frequency domain features. However, conventional TDA models are restricted to single-channel time series and are susceptible to noise, leading to the loss of topological features in persistence diagrams. This paper presents an enhanced TDA approach applicable to multi-channel EEG in ADHD. Initially, optimal input parameters for multi-channel EEG are determined. Subsequently, each channel’s EEG undergoes phase space reconstruction (PSR), followed by the use of k-Power Distance to Measure (k-PDTM) to approximate ideal point clouds. Then, multi-dimensional time series are re-embedded, and TDA is applied to obtain topological feature information. Gaussian function-based Multivariate Kernel Density Estimation (MKDE) is employed in the merged persistence diagram to filter out the desired topological feature mappings. Finally, the persistence image (PI) method is employed to extract topological features, and the influence of various weighting functions on the results is discussed. The effectiveness of our method is evaluated using the IEEE ADHD dataset. Results demonstrate that the accuracy, sensitivity, and specificity reach 78.27%, 80.62%, and 75.63%, respectively. Compared with traditional TDA methods, our method achieves a marked improvement, and it also outperforms typical nonlinear descriptors. These findings indicate that our method exhibits higher precision and robustness.
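The PSR step maps each single-channel series into a point cloud before any topology is computed; a minimal Takens delay embedding sketch (the `delay_embed` helper and the `dim`/`tau` values are illustrative, not the paper's optimised parameters) could be:

```python
import numpy as np

def delay_embed(signal, dim=3, tau=5):
    """Takens delay embedding: map a 1-D series into a point cloud in
    R^dim, where point i is (x[i], x[i+tau], ..., x[i+(dim-1)*tau]).
    This is the phase space reconstruction that precedes the TDA stage."""
    n = len(signal) - (dim - 1) * tau
    return np.stack([signal[i * tau: i * tau + n] for i in range(dim)], axis=1)

t = np.linspace(0, 8 * np.pi, 1000)
cloud = delay_embed(np.sin(t), dim=3, tau=25)
print(cloud.shape)  # (950, 3)
```

Persistent homology is then computed on `cloud` (e.g. via a Vietoris–Rips filtration) to produce the persistence diagrams that the k-PDTM and MKDE steps operate on.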
{"title":"Topological feature search method for multichannel EEG: Application in ADHD classification","authors":"Tianming Cai , Guoying Zhao , Junbin Zang , Chen Zong , Zhidong Zhang , Chenyang Xue","doi":"10.1016/j.bspc.2024.107153","DOIUrl":"10.1016/j.bspc.2024.107153","url":null,"abstract":"<div><div>In recent years, the preliminary diagnosis of Attention Deficit Hyperactivity Disorder (ADHD) using electroencephalography (EEG) has attracted attention from researchers. EEG, known for its expediency and efficiency, plays a pivotal role in the diagnosis and treatment of ADHD. However, the non-stationarity of EEG signals and inter-subject variability pose challenges to the diagnostic and classification processes. Topological Data Analysis (TDA) offers a novel perspective for ADHD classification, diverging from traditional time–frequency domain features. However, conventional TDA models are restricted to single-channel time series and are susceptible to noise, leading to the loss of topological features in persistence diagrams. This paper presents an enhanced TDA approach applicable to multi-channel EEG in ADHD. Initially, optimal input parameters for multi-channel EEG are determined. Subsequently, each channel’s EEG undergoes phase space reconstruction (PSR), followed by the use of k-Power Distance to Measure (k-PDTM) to approximate ideal point clouds. Then, multi-dimensional time series are re-embedded, and TDA is applied to obtain topological feature information. Gaussian function-based Multivariate Kernel Density Estimation (MKDE) is employed in the merged persistence diagram to filter out the desired topological feature mappings. Finally, the persistence image (PI) method is employed to extract topological features, and the influence of various weighting functions on the results is discussed. The effectiveness of our method is evaluated using the IEEE ADHD dataset. Results demonstrate that the accuracy, sensitivity, and specificity reach 78.27%, 80.62%, and 75.63%, respectively. Compared with traditional TDA methods, our method achieves a marked improvement, and it also outperforms typical nonlinear descriptors. These findings indicate that our method exhibits higher precision and robustness.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"100 ","pages":"Article 107153"},"PeriodicalIF":4.9,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142586395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}