Ocular disorders affect over 2.2 billion people globally, with glaucoma being a leading cause of blindness in India. Early detection of glaucoma is crucial, as increased fluid pressure gradually damages the optic nerve and leads to vision impairment. This study introduces an innovative approach for glaucoma detection and diagnosis that combines two-dimensional Fourier-Bessel series expansion-based empirical wavelet transforms (2D-FBSE-EWT) with a memristive crossbar array (MCA) model. The proposed method leverages deep learning and an ensemble EfficientNetb0-based technique to classify fundus images as either normal or glaucomatous. EfficientNetb0 outperformed other convolutional neural networks (CNNs) such as ResNet50, AlexNet, and GoogleNet, making it the optimal choice for glaucoma classification. Initially, the dataset was processed with the integrated MCA and 2D-FBSE-EWT model, and the reconstructed images were used for further classification. The reconstructed images were of high quality, with a peak signal-to-noise ratio (PSNR) of 26.2346 dB and a structural similarity index (SSIM) of 95.38%. The proposed method achieved an accuracy of 94.15% using EfficientNetb0 and improved accuracy and sensitivity by 32.14% and 40.93%, respectively, compared to the unprocessed dataset.
"Implementation of FBSE-EWT method in memristive crossbar array framework for automated glaucoma diagnosis from fundus images" by Kumari Jyoti, Saurabh Yadav, Chandrabhan Patel, Mayank Dubey, Pradeep Kumar Chaudhary, Ram Bilas Pachori, Shaibal Mukherjee. Biomedical Signal Processing and Control, vol. 100, Article 107087. Published 2024-11-15. DOI: 10.1016/j.bspc.2024.107087.
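The PSNR figure quoted above follows the standard definition; a minimal NumPy sketch is given below for reference. This is illustrative only, not the authors' evaluation code, and the 8-bit peak value of 255 is an assumption.

```python
import numpy as np

def psnr(reference: np.ndarray, reconstructed: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    mse = np.mean((reference.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# synthetic example: an 8-bit image corrupted by mild Gaussian noise
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
noisy = np.clip(ref + rng.normal(0.0, 5.0, size=ref.shape), 0, 255)
print(round(psnr(ref, noisy), 2))
```

With noise of standard deviation 5 the MSE is roughly 25, so the PSNR lands in the low-to-mid 30 dB range, a few dB above the 26.23 dB reported for the reconstructed fundus images.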
Pub Date: 2024-11-15. DOI: 10.1016/j.bspc.2024.107152
Alan Spark, Jan Kohout, Ludmila Verešpejová, Martin Chovanec, Jan Mareš
This paper introduces a systematic classification for a facial nerve grading system, built on a comprehensive methodology: a pioneering Multi-Path Heterogeneous Neural Network (MPHNN) designed for the accurate classification of facial exercises. It integrates four distinct Convolutional Neural Networks (CNNs) and Custom Feedforward Neural Networks (CFNNs) to enhance classification precision. The CNNs are tailored to scrutinize changes in the coordinates of facial landmarks over time, capturing both spatial information and temporal patterns in facial expressions during exercise. The CFNNs incorporate patient-specific variables and exercise statistics, including surgical history, the type and duration of the exercise, and synthetic features such as the cumulative movement of each landmark. By leveraging this comprehensive framework, the proposed method offers a nuanced representation of the patient's exercise performance, thereby facilitating more precise classification outcomes.
"Multi Path Heterogeneous Neural Networks: Novel comprehensive classification method of facial nerve function" by Alan Spark, Jan Kohout, Ludmila Verešpejová, Martin Chovanec, Jan Mareš. Biomedical Signal Processing and Control, vol. 101, Article 107152. Published 2024-11-15. DOI: 10.1016/j.bspc.2024.107152.
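The "cumulative movement" synthetic feature mentioned in the abstract can be computed as the total path length each landmark travels across frames. A minimal sketch follows; the `(frames, landmarks, xy)` array layout is an assumption, not the authors' implementation.

```python
import numpy as np

def cumulative_movement(landmarks: np.ndarray) -> np.ndarray:
    """Total path length travelled by each facial landmark.

    landmarks: array of shape (T, L, 2) -- T frames, L landmarks, (x, y).
    Returns an array of shape (L,) with summed frame-to-frame displacement.
    """
    steps = np.diff(landmarks, axis=0)               # (T-1, L, 2) displacements
    return np.linalg.norm(steps, axis=-1).sum(axis=0)

# toy trajectory: landmark 0 moves 1 px per frame along x, landmark 1 is still
traj = np.zeros((5, 2, 2))
traj[:, 0, 0] = np.arange(5)
print(cumulative_movement(traj))  # → [4. 0.]
```

Such per-landmark scalars are the kind of exercise statistics a feedforward path can consume alongside the CNN paths that process the raw coordinate sequences.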
Pub Date: 2024-11-15. DOI: 10.1016/j.bspc.2024.107055
Zhoushan Feng, Yuliang Zhang, Yanhong Chen, Yiyu Shi, Yu Liu, Wen Sun, Lili Du, Dunjin Chen
Polyp segmentation in colonoscopy imagery is a critical procedure in the early detection and preemptive management of colorectal cancer. In facilitating the diagnostic procedures, it is pivotal to attain segmentation with high precision, emphasizing fine-grained details which can potentially harbor crucial information regarding the disease state. To address the prevailing demand for more refined segmentation techniques, this study introduces an innovative framework “SwinSAM”, which ingeniously integrates a Swin Transformer decoder with a SAM encoder. The SAM model has seen over a billion images and possesses a strong capability for image comprehension. However, its training data primarily originates from natural images rather than medical ones. Hence, we designed an adapter module to infuse specific medical domain information into SAM. Furthermore, due to the varying sizes and shapes of polyps, along with their high blending degree with the background, the simplistic convolutional decoder in the original SAM model struggles to accurately segment the intricate details of polyps. This prompted us to utilize the Swin Transformer as the decoder. Additionally, considering the significant shape variations of polyps, we employed a multi-scale perception fusion module to process the deep features extracted by SAM. By using convolutions with different receptive fields, we can extract information about polyps of various shapes. Finally, we optimized the network parameters through multi-level supervision. Comprehensive experiments were conducted on five commonly used polyp segmentation datasets. The results validate that our proposed method achieves good performance across datasets with different polyp backgrounds.
"SwinSAM: Fine-grained polyp segmentation in colonoscopy images via segment anything model integrated with a Swin Transformer decoder" by Zhoushan Feng, Yuliang Zhang, Yanhong Chen, Yiyu Shi, Yu Liu, Wen Sun, Lili Du, Dunjin Chen. Biomedical Signal Processing and Control, vol. 100, Article 107055. Published 2024-11-15. DOI: 10.1016/j.bspc.2024.107055.
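The idea behind the multi-scale perception fusion module, parallel branches with different receptive fields, can be illustrated with average pooling at several window sizes. The actual module uses learned convolutions, so the sketch below is only a hedged stand-in for intuition.

```python
import numpy as np

def multiscale_pool(feat: np.ndarray, scales=(1, 2, 4)) -> np.ndarray:
    """Average-pool a square feature map at several window sizes and stack
    the upsampled results, mimicking parallel branches whose effective
    receptive fields grow with the window size."""
    h, w = feat.shape
    branches = []
    for s in scales:  # assumes s divides both h and w
        pooled = feat.reshape(h // s, s, w // s, s).mean(axis=(1, 3))
        branches.append(np.repeat(np.repeat(pooled, s, axis=0), s, axis=1))
    return np.stack(branches)  # (len(scales), h, w)

fmap = np.arange(16, dtype=float).reshape(4, 4)
fused = multiscale_pool(fmap)
print(fused.shape)  # (3, 4, 4)
```

The smallest scale preserves fine detail (useful for tiny polyps) while the largest summarizes global context (useful for polyps that blend into the background); a learned fusion would weight these branches per pixel.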
Pub Date: 2024-11-15. DOI: 10.1016/j.bspc.2024.107057
Vahid Safari Dehnavi, Masoud Shafiee
In recent years, significant advances have been made in biological signal processing, allowing for the control of robotic devices. This paper introduces an innovative hand rehabilitation method for improving brain-hand connectivity using a robotic hand based on cognitive robotics. The process begins by recording the user's electroencephalogram (EEG) and electromyogram (EMG) signals while hand movements are performed in two different positions. Next, a method for effective EEG and EMG channel selection is developed, followed by two algorithms for classifying various hand movement patterns. The first algorithm incorporates preprocessing, window selection, feature extraction, and machine learning algorithms. The second uses automatic feature extraction via an optimized CNN-LSTM-SVM. The rehabilitation process is controlled using fractional order singular optimal control, based on the identified hand movement patterns and an optimal controller design. This control approach applies to both time-invariant and time-varying systems. A mathematical model of the constrained rehabilitation process with a robotic hand is derived using fractional order singular theory, and the fractional order singular optimal control problem is solved via a numerical-analytical approach that utilizes Hamiltonian and orthogonal polynomials. A master supervises the entire process, and adjustments are made to each component if the error exceeds a desired threshold. Finally, a simulation demonstrates the effectiveness of the proposed method, and conclusions are drawn regarding the feasibility and potential advantages of cognitive robotics-based control for robotic hand rehabilitation.
"A novel method for hands rehabilitation using optimal control of fractional order singular system and biological signals" by Vahid Safari Dehnavi, Masoud Shafiee. Biomedical Signal Processing and Control, vol. 100, Article 107057. Published 2024-11-15. DOI: 10.1016/j.bspc.2024.107057.
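For readers unfamiliar with the terminology, fractional order singular dynamics are commonly written with the Caputo derivative and a rank-deficient descriptor matrix. The formulation below uses textbook definitions and is not taken from the paper, whose exact model may differ.

```latex
% Caputo fractional derivative of order 0 < \alpha < 1
{}^{C}\!D^{\alpha} x(t) = \frac{1}{\Gamma(1-\alpha)} \int_{0}^{t}
    \frac{\dot{x}(\tau)}{(t-\tau)^{\alpha}} \, d\tau
% Fractional order singular (descriptor) system:
% E may be rank-deficient, which makes the system "singular"
E \, {}^{C}\!D^{\alpha} x(t) = A x(t) + B u(t)
```

The singularity of E is what forces the specialized optimal control machinery (Hamiltonian plus orthogonal polynomial expansion) instead of a standard Riccati solution.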
Pub Date: 2024-11-14. DOI: 10.1016/j.bspc.2024.107130
Zelin Qiu, Jianjun Gu, Dingding Yao, Junfeng Li, Yonghong Yan
The spatial auditory attention decoding (Sp-AAD) technology aims to determine the direction of auditory attention in multi-talker scenarios via neural recordings. Despite the success of recent Sp-AAD algorithms, their performance is hindered by trial-specific features in EEG data. This study aims to improve decoding performance against these features. Studies in neuroscience indicate that spatial auditory attention can be reflected in the topological distribution of EEG energy across different frequency bands. This insight motivates us to propose Prototype Training, a wavelet-based training method for Sp-AAD. This method constructs prototypes with enhanced energy distribution representations and reduced trial-specific characteristics, enabling the model to better capture auditory attention features. To implement prototype training, an EEGWaveNet that employs the wavelet transform of EEG is further proposed. Detailed experiments indicate that the EEGWaveNet with prototype training outperforms other competitive models on various datasets, and the effectiveness of the proposed method is also validated. As a training method independent of model architecture, prototype training offers new insights into the field of Sp-AAD. The source code is available online at: https://github.com/qiuzelinChina/PrototypeTraining.
"Enhancing spatial auditory attention decoding with wavelet-based prototype training" by Zelin Qiu, Jianjun Gu, Dingding Yao, Junfeng Li, Yonghong Yan. Biomedical Signal Processing and Control, vol. 100, Article 107130. Published 2024-11-14. DOI: 10.1016/j.bspc.2024.107130. Source code: https://github.com/qiuzelinChina/PrototypeTraining
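The neuroscientific premise above, that attention is reflected in how EEG energy distributes across frequency bands, can be made concrete with a per-channel band-energy computation. The FFT approach and the band choices below are illustrative assumptions; the paper itself uses wavelet transforms.

```python
import numpy as np

BANDS = {"alpha": (8.0, 13.0), "beta": (13.0, 30.0)}  # Hz; illustrative subset

def band_energies(eeg: np.ndarray, fs: float) -> np.ndarray:
    """Per-channel spectral energy in each frequency band.

    eeg: (channels, samples). Returns (channels, len(BANDS)).
    """
    freqs = np.fft.rfftfreq(eeg.shape[1], d=1.0 / fs)
    power = np.abs(np.fft.rfft(eeg, axis=1)) ** 2
    out = []
    for lo, hi in BANDS.values():
        mask = (freqs >= lo) & (freqs < hi)
        out.append(power[:, mask].sum(axis=1))
    return np.stack(out, axis=1)

fs = 128.0
t = np.arange(int(fs * 2)) / fs                # 2 s of synthetic EEG
sig = np.vstack([np.sin(2 * np.pi * 10 * t),   # channel dominated by alpha
                 np.sin(2 * np.pi * 20 * t)])  # channel dominated by beta
e = band_energies(sig, fs)
```

Arranging such band energies over the electrode layout yields the topographic maps whose left-right asymmetries carry the spatial attention signal; prototype training averages these representations to suppress trial-specific variation.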
Pub Date: 2024-11-14. DOI: 10.1016/j.bspc.2024.107177
Francesco Di Luzio, Antonello Rosato, Massimo Panella
In the context of artificial intelligence, the inherent human attribute of engaging in logical reasoning to facilitate decision-making is mirrored by the concept of explainability, which pertains to the ability of a model to provide a clear and interpretable account of how it arrived at a particular outcome. This study explores explainability techniques for binary deep neural architectures in the framework of emotion classification through video analysis. We investigate the optimization of input features to binary classifiers for emotion recognition, with facial landmark detection, using an improved version of the Integrated Gradients explainability method. The main contribution of this paper is the employment of an innovative explainable artificial intelligence algorithm to understand the facial landmark movements characteristic of emotional feeling, and the use of this information to improve the performance of deep learning-based emotion classifiers. By means of explainability, we can optimize the number and position of the facial landmarks used as input features for facial emotion recognition, lowering the impact of noisy landmarks and thus increasing the accuracy of the developed models. To test the effectiveness of the proposed approach, we considered a set of deep binary models for emotion classification, trained initially with a complete set of facial landmarks that is progressively reduced under the guidance of a suitable optimization procedure. The obtained results prove the robustness of the proposed explainable approach in identifying the relevance of different facial points for different emotions, improving classification accuracy, and diminishing the computational cost.
"An explainable fast deep neural network for emotion recognition" by Francesco Di Luzio, Antonello Rosato, Massimo Panella. Biomedical Signal Processing and Control, vol. 100, Article 107177. Published 2024-11-14. DOI: 10.1016/j.bspc.2024.107177.
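The base Integrated Gradients method (before the paper's improvements) is well defined: attributions are the input-baseline difference times the path integral of the gradient. A minimal sketch on a toy quadratic model with an analytic gradient; the paper's improved variant is not reproduced here.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=100):
    """Riemann (midpoint) approximation of Integrated Gradients:
    (x - x') * integral_0^1 grad f(x' + a (x - x')) da."""
    alphas = (np.arange(steps) + 0.5) / steps            # midpoints in (0, 1)
    path = baseline + alphas[:, None] * (x - baseline)   # (steps, d)
    grads = np.stack([grad_fn(p) for p in path])
    return (x - baseline) * grads.mean(axis=0)

# toy model f(x) = sum(x^2), so grad f(x) = 2x
f = lambda x: float(np.sum(x ** 2))
grad_f = lambda x: 2.0 * x
x, x0 = np.array([1.0, 2.0]), np.zeros(2)
attr = integrated_gradients(grad_f, x, x0)
# completeness axiom: attributions sum to f(x) - f(baseline)
print(attr, attr.sum(), f(x) - f(x0))
```

The completeness property is what makes per-landmark attribution scores comparable, so landmarks with consistently small attributions can be dropped from the input set.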
Pub Date: 2024-11-14. DOI: 10.1016/j.bspc.2024.107144
Wenlong Hang, Peng Dai, Chengao Pan, Shuang Liang, Qingfeng Zhang, Qiang Wu, Yukun Jin, Qiong Wang, Jing Qin
Semi-supervised learning (SSL) has shown promising results in 3D medical image segmentation by utilizing both labeled and readily available unlabeled images. Most current SSL methods predict unlabeled data under different perturbations using subnetworks with the same architecture. Despite their progress, this homogenization of subnetworks limits the diversity of predictions on both labeled and unlabeled data, making it difficult for the subnetworks to correct each other and giving rise to a confirmation bias issue. In this paper, we introduce an SSL framework termed pseudo-label guided selective mutual learning (PLSML), which incorporates two distinct subnetworks and selectively utilizes their derived pseudo-labels for mutual supervision to mitigate this issue. Specifically, discrepancies between the pseudo-labels of the two subnetworks are used to select regions within labeled images that are prone to missegmentation, and a mutual discrepancy correction (MDC) regularization is introduced to revisit these regions. Moreover, a selective mutual pseudo supervision (SMPS) regularization estimates the reliability of pseudo-labels on unlabeled images and selectively lets the subnetwork with the more reliable pseudo-label supervise the other. The integration of the MDC and SMPS regularizations facilitates inter-subnetwork mutual correction, consequently mitigating confirmation bias. Extensive experiments on two 3D medical image datasets demonstrate the superiority of PLSML compared to state-of-the-art SSL methods. The source code is available online at https://github.com/1pca0/PLSML.
"Pseudo-label guided selective mutual learning for semi-supervised 3D medical image segmentation" by Wenlong Hang, Peng Dai, Chengao Pan, Shuang Liang, Qingfeng Zhang, Qiang Wu, Yukun Jin, Qiong Wang, Jing Qin. Biomedical Signal Processing and Control, vol. 100, Article 107144. Published 2024-11-14. DOI: 10.1016/j.bspc.2024.107144. Source code: https://github.com/1pca0/PLSML
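The selection step of SMPS-style supervision can be sketched as follows: at each position, keep the pseudo-label of whichever subnetwork is more confident, and mask out positions where neither is confident enough. The threshold value and array shapes are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def select_reliable(prob_a: np.ndarray, prob_b: np.ndarray, tau: float = 0.8):
    """Per-position selective mutual pseudo supervision (illustrative).

    prob_a, prob_b: softmax outputs of two subnetworks, shape (N, classes).
    Returns (labels, mask): the more confident subnetwork's hard label at
    each position, and a boolean mask keeping only positions where the
    winning confidence reaches tau.
    """
    conf_a, conf_b = prob_a.max(axis=-1), prob_b.max(axis=-1)
    use_a = conf_a >= conf_b
    labels = np.where(use_a, prob_a.argmax(-1), prob_b.argmax(-1))
    mask = np.maximum(conf_a, conf_b) >= tau  # unreliable positions ignored
    return labels, mask

pa = np.array([[0.9, 0.1], [0.55, 0.45]])
pb = np.array([[0.6, 0.4], [0.3, 0.7]])
labels, mask = select_reliable(pa, pb)
```

Because only the more reliable subnetwork supervises the other at each position, errors are less likely to be reinforced symmetrically, which is the mechanism the abstract credits for reducing confirmation bias.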
Pub Date: 2024-11-14. DOI: 10.1016/j.bspc.2024.107142
Mohamed Elkharadly, Khaled Amin, O.M. Abo-Seida, Mina Ibrahim
Parkinson's disease (PD) is a progressive neurodegenerative condition that adversely impacts motor skills, speech, and cognitive abilities. Research has revealed that verbal impediments manifest in the early stages of PD, making them a potential diagnostic marker. This study introduces an innovative approach that leverages Bayesian Optimization (BO) to tune a fuzzy k-nearest neighbor (FKNN) model, enhancing the detection of PD. BO-FKNN was validated on a speech dataset. To comprehensively evaluate its efficacy, BO-FKNN was compared against five commonly used parameter optimization methods: FKNN based on Particle Swarm Optimization, on a Genetic algorithm, on the Bat algorithm, on the Artificial Bee Colony algorithm, and on Grid search. Moreover, to further boost diagnostic accuracy, a hybrid feature selection method based on the Pearson Correlation Coefficient (PCC) and Information Gain (IG) was applied before BO-FKNN, yielding the proposed PCCIG-BO-FKNN. The experimental outcomes highlight the superior performance of the proposed system, with a classification accuracy of 98.47%.
"Bayesian optimization enhanced FKNN model for Parkinson's diagnosis" by Mohamed Elkharadly, Khaled Amin, O.M. Abo-Seida, Mina Ibrahim. Biomedical Signal Processing and Control, vol. 100, Article 107142. Published 2024-11-14. DOI: 10.1016/j.bspc.2024.107142.
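The FKNN classifier being optimized is the classical fuzzy k-NN of Keller et al.: class memberships are weighted by inverse distances over the k nearest neighbors. A minimal sketch with crisp training labels; the fuzziness exponent m = 2 and crisp memberships are common defaults, not the paper's tuned settings.

```python
import numpy as np

def fknn_predict(X_train, y_train, x, k=3, m=2.0):
    """Fuzzy k-NN: class memberships weighted by 1/d^(2/(m-1)) over the
    k nearest training points; memberships across classes sum to 1."""
    d = np.linalg.norm(X_train - x, axis=1)
    nn = np.argsort(d)[:k]
    w = 1.0 / np.maximum(d[nn], 1e-12) ** (2.0 / (m - 1.0))
    classes = np.unique(y_train)
    memberships = np.array([w[y_train[nn] == c].sum() for c in classes]) / w.sum()
    return classes[np.argmax(memberships)], memberships

X = np.array([[0.0], [0.1], [0.2], [1.0], [1.1]])
y = np.array([0, 0, 0, 1, 1])
pred, u = fknn_predict(X, y, np.array([0.05]))
```

The hyperparameters BO would search here are k and m (and, in the full pipeline, the feature subset retained by PCC-IG), with cross-validated accuracy as the objective.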
Pub Date: 2024-11-14, DOI: 10.1016/j.bspc.2024.107126
Okan Guder, Yasemin Cetin-Kaya
Timely detection of brain tumors is crucial for developing effective treatment strategies and improving the overall well-being of patients. In this work, we introduce an innovative approach for classifying and diagnosing brain tumors from magnetic resonance images with a deep learning model. In the proposed method, various attention mechanisms, which allow the model to assign different degrees of importance to certain inputs, are used and their performances are compared. Additionally, the Particle Swarm Optimization algorithm is employed to find the optimal hyperparameter values for the Convolutional Neural Network model that incorporates attention mechanisms. A four-class public dataset from the Kaggle website was used to evaluate the effectiveness of the proposed method. A maximum accuracy of 99%, precision of 99.02%, recall of 99%, and F1 score of 99.01% were obtained on the Kaggle test dataset. In addition, to assess the model's adaptability and robustness, salt-and-pepper noise was introduced to the same test dataset at various rates, and the model's performance was re-evaluated. A maximum accuracy of 97.78% was obtained on the test dataset with 1% noise, 95.04% with 2% noise, and 88.10% with 3% noise. Analysis of these results indicates that the proposed model can be used successfully for brain tumor classification and can assist doctors in making diagnostic decisions.
{"title":"Optimized attention-based lightweight CNN using particle swarm optimization for brain tumor classification","authors":"Okan Guder, Yasemin Cetin-Kaya","doi":"10.1016/j.bspc.2024.107126","DOIUrl":"10.1016/j.bspc.2024.107126","url":null,"abstract":"<div><div>Timely detection of brain tumors is crucial for developing effective treatment strategies and improving the overall well-being of patients. We introduced an innovative approach in this work for classifying and diagnosing brain tumors with the help of magnetic resonance imaging and a deep learning model. In the proposed method, various attention mechanisms that allow the model to assign different degrees of importance to certain inputs are used, and their performances are compared. Additionally, the Particle Swarm Optimization algorithm is employed to find the optimal hyperparameter values for the Convolutional Neural Network model that incorporates attention mechanisms. A four-class public dataset from the Kaggle website was used to evaluate the effectiveness of the proposed method. A maximum accuracy of 99%, precision of 99.02%, recall of 99%, and F1 score of 99.01% were obtained on the Kaggle test dataset. In addition, to assess the model’s adaptability and robustness, salt-and-pepper noise was introduced to the same test dataset at various rates, and the models’ performance was re-evaluated. A maximum accuracy of 97.78% was obtained on the test data set with 1% noise, 95.04% on the test data set with 2% noise, and 88.10% on the test data set with 3% noise. 
Analysis of these results indicates that the proposed model can be used successfully for brain tumor classification and can assist doctors in making diagnostic decisions.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"100 ","pages":"Article 107126"},"PeriodicalIF":4.9,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142658268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13, DOI: 10.1016/j.bspc.2024.107099
Chengzhang Zhu , Renmao Zhang , Yalong Xiao , Beiji Zou , Zhangzheng Yang , Jianfeng Li , Xinze Li
The accuracy of medical image segmentation is crucial for clinical analysis and diagnosis. Despite progress with U-Net-inspired models, they often underuse the multi-scale convolutional layers that are crucial for capturing fine visual detail, and overlook the importance of merging multi-scale features along the channel dimension to enrich the decoder. To address these limitations, we introduce a Multi-perspective Feature Compensation Enhanced Network (MFCNet) for medical image segmentation. Our network design is characterized by the strategic employment of dual-scale convolutional kernels at each encoder level, enabling the precise capture of both granular and broader context features throughout the encoding phase. We further enhance the model by integrating a Dual-scale Channel-wise Cross-fusion Transformer (DCCT) mechanism within the skip connections, which effectively integrates the dual-scale features. We then apply a spatial attention (SA) mechanism to amplify the salient regions within the dual-scale features. These enhanced features are merged with the feature map at the same level of the decoder, thereby augmenting the overall feature representation. Our proposed MFCNet has been evaluated on three distinct medical image datasets, and the experimental results demonstrate that it achieves more accurate segmentation and adapts better to varying segmentation targets, making it more competitive than existing methods. The code is available at: https://github.com/zrm-code/MFCNet.
{"title":"Multi-perspective feature compensation enhanced network for medical image segmentation","authors":"Chengzhang Zhu , Renmao Zhang , Yalong Xiao , Beiji Zou , Zhangzheng Yang , Jianfeng Li , Xinze Li","doi":"10.1016/j.bspc.2024.107099","DOIUrl":"10.1016/j.bspc.2024.107099","url":null,"abstract":"<div><div>Medical image segmentation’s accuracy is crucial for clinical analysis and diagnosis. Despite progress with U-Net-inspired models, they often underuse multi-scale convolutional layers crucial for enhancing detailing visual features and overlooking the importance of merging multi-scale features within the channel dimension to enhance decoder complexity. To address these limitations, we introduce a Multi-perspective Feature Compensation Enhanced Network (MFCNet) for medical image segmentation. Our network design is characterized by the strategic employment of dual-scale convolutional kernels at each encoder level. This synergy enables the precise capture of both granular and broader context features throughout the encoding phase. We further enhance the model by integrating a Dual-scale Channel-wise Cross-fusion Transformer (DCCT) mechanism within the skip connections. This innovation effectively integrates dual-scale features. We subsequently implemented the spatial attention (SA) mechanism to amplify the saliency areas within the dual-scale features. These enhanced features were subsequently merged with the feature map of the same level in the decoder, thereby augmenting the overall feature representation. Our proposed MFCNet has been evaluated on three distinct medical image datasets, and the experimental results demonstrate that it achieves more accurate segmentation performance and adaptability to varying target segmentation, making it more competitive compared to existing methods. 
The code is available at: <span><span>https://github.com/zrm-code/MFCNet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"100 ","pages":"Article 107099"},"PeriodicalIF":4.9,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142657661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
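The MFCNet record above applies a spatial attention (SA) mechanism to highlight salient regions of a feature map. The abstract does not spell out the module's internals, so the sketch below follows the common CBAM-style recipe: squeeze the channel axis with average- and max-pooling, mix the two maps, and squash through a sigmoid to reweight every spatial location. The learned convolution over the pooled maps is replaced here by a fixed weighted sum (`w_avg`, `w_max` are hypothetical parameters), so this is an illustrative assumption, not the authors' exact module.

```python
import numpy as np

def spatial_attention(feat, w_avg=0.5, w_max=0.5):
    """Spatial attention sketch for a feature map of shape (C, H, W):
    pool over channels two ways, fuse the maps, apply a sigmoid gate,
    and scale every channel of the input by the resulting (H, W) map."""
    avg_map = feat.mean(axis=0)                  # (H, W) channel-average pool
    max_map = feat.max(axis=0)                   # (H, W) channel-max pool
    score = w_avg * avg_map + w_max * max_map    # stand-in for the learned conv
    attn = 1.0 / (1.0 + np.exp(-score))          # sigmoid gate in (0, 1)
    return feat * attn[None, :, :], attn         # broadcast over channels
```

In MFCNet the gated features would then be concatenated with the decoder feature map at the same level, as the abstract describes.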