Pub Date: 2025-02-04 | DOI: 10.1016/j.bspc.2025.107556
Laura Arjona, Sergio Hernández, Girish Narayanswamy, Alfonso Bahillo, Shwetak Patel
We present AutoFlow, a Raspberry Pi-based acoustic platform that uses machine learning to autonomously detect and record voiding events. Uroflowmetry is a noninvasive diagnostic test for urinary tract function. Current uroflowmetry tests are not suitable for continuous health monitoring in a nonclinical environment because they are often distressing, costly, and burdensome for patients. To address these limitations, we developed a low-cost platform that is easily integrated into daily home routines. Using an acoustic dataset of home bathroom sounds, we trained and evaluated five machine learning models. The Gradient Boost model on a Raspberry Pi Zero 2 W achieved 95.63% accuracy with a 0.15-second inference time. AutoFlow aims to enhance personalized healthcare at home and in areas with limited specialist access.
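As a rough illustration of the kind of lightweight acoustic front end such a platform could run, the sketch below computes per-frame RMS energy and zero-crossing rate, two cheap features a gradient-boosted classifier could consume on a Pi-class device. The feature set, frame length, and hop size are assumptions for illustration, not the paper's actual pipeline.

```python
import math

def frame_features(signal, frame_len=256, hop=128):
    """Per-frame RMS energy and zero-crossing rate: low-cost
    acoustic features suitable for a lightweight classifier
    on an embedded device."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        # Fraction of adjacent sample pairs that change sign.
        zcr = sum(
            1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
        ) / (frame_len - 1)
        feats.append((rms, zcr))
    return feats

# Toy input: a 440 Hz tone sampled at 8 kHz.
sig = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(1024)]
feats = frame_features(sig)
```

Each frame yields one (energy, zero-crossing) pair; stacking these over a recording gives the tabular input a gradient-boosting model expects.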
{"title":"Autonomous collection of voiding events for sound uroflowmetries with machine learning","authors":"Laura Arjona , Sergio Hernández , Girish Narayanswamy , Alfonso Bahillo , Shwetak Patel","doi":"10.1016/j.bspc.2025.107556","DOIUrl":"10.1016/j.bspc.2025.107556","url":null,"abstract":"<div><div>We present AutoFlow, a Raspberry Pi-based acoustic platform that uses machine learning to autonomously detect and record voiding events. Uroflowmetry, a noninvasive diagnostic test for urinary tract function. Current uroflowmetry tests are not suitable for continuous health monitoring in a nonclinical environment because they are often distressing, costly, and burdensome for the public. To address these limitations, we developed a low-cost platform easily integrated into daily home routines. Using an acoustic dataset of home bathroom sounds, we trained and evaluated five machine learning models. The Gradient Boost model on a Raspberry Pi Zero 2 W achieved 95.63% accuracy and 0.15-second inference time. AutoFlow aims to enhance personalized healthcare at home and in areas with limited specialist access.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"105 ","pages":"Article 107556"},"PeriodicalIF":4.9,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143147502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gastrointestinal (GI) diseases present a significant healthcare challenge, requiring the development of accurate and efficient diagnostic methods. Traditional diagnostic methods often use single models, which have difficulty capturing the complex and varied patterns of these conditions. To address this limitation, we propose an ensemble method specifically designed for the detection of GI diseases. Our approach strategically selects three robust base models (EfficientNetB0, EfficientNetB2, and ResNet101), leveraging transfer learning to utilize their pre-trained weights and feature representations. This foundation accelerates training and enhances the models' capacity to discern complex patterns associated with gastrointestinal conditions. Our methodology is centered on a novel beta normalization aggregation scheme that combines insights from individual models according to their confidence scores. This refined aggregation approach culminates in a nuanced ensemble model that improves overall predictive accuracy. We rigorously evaluate our proposed method on two established gastrointestinal datasets, one comprising four classes and the other three, achieving accuracies of 97.88% and 97.47%, respectively. Notably, our approach not only outperforms the individual base models but also surpasses existing methodologies in gastrointestinal diagnosis. Using Grad-CAM analysis, we present visualizations of the decision-making processes in our models, which enhances both interpretability and trustworthiness. Unlike conventional ensemble methods that rely on basic summation or other traditional strategies, our weighted average strategy and improved beta normalization scheme position our ensemble method as a powerful and reliable tool for accurate gastrointestinal disease detection.
This advancement holds the potential to significantly enhance diagnostic precision in the complex landscape of gastrointestinal health, offering new avenues for clinical application and improved patient outcomes.
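A minimal sketch of confidence-weighted aggregation in the spirit described above. The exact beta normalization scheme is defined in the paper itself; the weighting rule here (each model weighted by its maximum class probability) is a stand-in assumption.

```python
def confidence_weighted_ensemble(prob_rows):
    """Combine per-model class-probability vectors, weighting each
    model by its confidence (max probability), then renormalizing
    so the fused vector sums to one."""
    n_classes = len(prob_rows[0])
    combined = [0.0] * n_classes
    for probs in prob_rows:
        w = max(probs)  # model confidence (assumed weighting rule)
        for c, p in enumerate(probs):
            combined[c] += w * p
    total = sum(combined)
    return [v / total for v in combined]

# Three hypothetical base models scoring four GI classes:
fused = confidence_weighted_ensemble([
    [0.7, 0.1, 0.1, 0.1],
    [0.4, 0.3, 0.2, 0.1],
    [0.6, 0.2, 0.1, 0.1],
])
```

The point of confidence weighting is that a hesitant model (flat probabilities) contributes less to the fused prediction than a decisive one.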
{"title":"An ensemble approach of deep CNN models with Beta normalization aggregation for gastrointestinal disease detection","authors":"Zafran Waheed , Jinsong Gui , Kamran Amjad , Ikram Waheed , Sohaib Asif","doi":"10.1016/j.bspc.2025.107567","DOIUrl":"10.1016/j.bspc.2025.107567","url":null,"abstract":"<div><div>Gastrointestinal (GI) diseases present a significant healthcare challenge, requiring the development of accurate and efficient diagnostic methods. Traditional diagnostic methods often use single models, which have difficulty capturing the complex and varied patterns of these conditions. To address this limitation, we propose a groundbreaking ensemble method specifically designed for the detection of GI. Our approach strategically selects three robust base models—EfficientNetB0, EfficientNetB2, and ResNet101—leveraging transfer learning to utilize their pre-trained weights and feature representations. This foundation accelerates training and enhanced the models’ capacity to discern complex patterns associated with gastrointestinal conditions. Our methodology is centered around a novel beta normalization aggregation scheme that combines insights from individual models according to their confidence scores. This refined aggregation approach culminates in a nuanced ensemble model that improves overall predictive accuracy. We rigorously evaluate our proposed method on two established gastrointestinal datasets—one comprising four classes and the other three classes—achieving exceptional accuracies of 97.88% and 97.47%, respectively. Notably, our approach not only outperforms the individual base models but also surpasses existing methodologies in gastrointestinal diagnosis. Using Grad-CAM analysis, we present visualizations of the decision-making processes in our models, which enhanced both interpretability and trustworthiness. 
Unlike conventional ensemble methods that utilize basic summation or other traditional strategies, our innovative weighted average strategy and improved beta normalization scheme position our ensemble method as a powerful and reliable tool for accurate gastrointestinal disease detection. This advancement holds the potential to significantly enhance diagnostic precision in the complex landscape of gastrointestinal health, offering new avenues for clinical application and improved patient outcomes.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"105 ","pages":"Article 107567"},"PeriodicalIF":4.9,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143147461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cervical cancer is a significant health issue affecting women globally, with a high number of new cases and deaths reported each year. The disease is linked to HPV infection, but early detection through Pap smear tests can significantly improve treatment outcomes. Deep learning techniques, particularly convolutional neural networks, transfer learning, generative adversarial networks, and attention mechanisms, are employed to identify cervical cancer. These methods can increase the effectiveness and efficiency of cervical cancer screening and diagnosis. Although these technologies provide advantages for diagnosing cervical cancer, issues remain regarding the availability and integrity of data, the interpretability of models, and integration into clinical workflows. A computer-aided diagnostic system that uses vision transformers, a majority fusion mechanism, and explainable artificial intelligence is presented to address these challenges. This framework aims to increase cervical cancer detection accuracy and efficiency. Two benchmark datasets, DTU/Herlev and SIPaKMeD, are used to evaluate the system, yielding overall accuracies of 99.22% and 99.8%, respectively. A comparison of the proposed framework with state-of-the-art methods revealed equivalent or better results.
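A majority fusion mechanism can be sketched as a plain vote over per-model class predictions. The tie-breaking rule here (highest summed confidence among tied classes) is an illustrative assumption; the paper's exact fusion rule may differ.

```python
from collections import Counter

def majority_fusion(predictions, confidences):
    """Majority vote over per-model class predictions; ties are
    broken by total confidence among the tied classes."""
    votes = Counter(predictions)
    top = max(votes.values())
    tied = [c for c, v in votes.items() if v == top]
    if len(tied) == 1:
        return tied[0]
    # Tie-break: highest summed confidence among tied classes.
    score = {c: 0.0 for c in tied}
    for pred, conf in zip(predictions, confidences):
        if pred in score:
            score[pred] += conf
    return max(score, key=score.get)

label = majority_fusion(["benign", "malignant", "benign"], [0.9, 0.99, 0.8])
```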
{"title":"Challenging the status quo: Why artificial intelligence models must go beyond accuracy in cervical cancer diagnosis","authors":"Yousry AbdulAzeem , Hossam Magdy Balaha , Hanaa ZainEldin , Waleed AbdelKarim Abuain , Mahmoud Badawy , Mostafa A. Elhosseini","doi":"10.1016/j.bspc.2025.107620","DOIUrl":"10.1016/j.bspc.2025.107620","url":null,"abstract":"<div><div>Cervical cancer is a significant health issue affecting women globally, with a high number of new cases and deaths reported each year. The disease is linked to HPV infection, but early detection through Pap smear tests can significantly increase performance. Deep learning techniques, particularly convolutional neural networks, transfer learning, generative adversarial networks, and attention mechanisms, are employed to identify cervical cancer. These innovative methods can increase the effectiveness and efficiency of cervical cancer screening and diagnosis. Although these technologies provide advantages for diagnosing cervical cancer, issues related to the availability and integrity of data, interpretability of models, and integration into clinical workflows exist. A computer-aided diagnostic system that uses vision transformers, a majority fusion mechanism, and explainable artificial intelligence is presented to address these challenges. This framework aims to increase cervical cancer detection accuracy and efficiency. Two cutting-edge datasets, DTU/Herlev and SIPaKMeD, are used to evaluate the system, yielding overall accuracy results of 99.22% and 99.8%, respectively. 
A comparison of the suggested framework with state-of-the-art methods revealed equivalent or even better results.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"105 ","pages":"Article 107620"},"PeriodicalIF":4.9,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143147467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-03 | DOI: 10.1016/j.bspc.2025.107607
R. Niranjana, A. Ravi, J. Sivadasan
Breast cancer is statistically one of the most serious health complications worldwide, and mammography is a popular primary imaging modality. Early detection of minor abnormalities is essential to spare patients the advanced stages of cancer. To address this need, a novel hybrid deep learning network, IEUNet++, is proposed in this paper for multiclass classification of breast mammogram images. The method uses an ensemble of high-performing deep learning (DL) networks, InceptionResNetV2 and EfficientNetB7, as its encoder structure to automatically analyze mammograms, segment tumors distinctly, and categorize masses and calcifications as normal, innoxious (benign), or noxious (malignant), enhancing the Computer-Aided Diagnostic (CAD) system and strongly supporting the medical expert's analysis. The proposed method is evaluated on three common public datasets: CBIS-DDSM, INBreast, and MIAS. The objective of this paper is to propose an automatic scheme that can efficiently classify breast lesions of different dimensions and patterns of occurrence (masses and calcifications) with low levels of false positives and negatives, regardless of the heterogeneity of the dataset involved. In the first case, the classification results attain sensitivities of 99.56%, 99.72%, and 99.81%, Dice scores of 0.907, 0.925, and 0.955, Intersection-over-Union scores of 0.953, 0.911, and 0.956, and specificities of 0.996, 0.997, and 0.998 for the CBIS-DDSM, INBreast, and MIAS datasets, respectively; in the second case of classification, the method attains 99.87% accuracy, 99.77% sensitivity, a 0.972 Dice score, 0.941 IoU, and 0.998 specificity.
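The Dice and Intersection-over-Union scores reported above are standard overlap metrics for segmentation. For reference, a minimal computation on flat binary masks:

```python
def dice_and_iou(pred, truth):
    """Dice coefficient and Intersection-over-Union for binary
    segmentation masks given as flat 0/1 lists. Empty masks are
    treated as a perfect match by convention."""
    inter = sum(p and t for p, t in zip(pred, truth))
    p_sum, t_sum = sum(pred), sum(truth)
    union = p_sum + t_sum - inter
    dice = 2 * inter / (p_sum + t_sum) if (p_sum + t_sum) else 1.0
    iou = inter / union if union else 1.0
    return dice, iou

# Predicted mask flags two pixels; ground truth flags one of them.
d, i = dice_and_iou([1, 1, 0, 0], [1, 0, 0, 0])
```

Dice always dominates IoU for partial overlaps (here 2/3 versus 1/2), which is worth remembering when comparing papers that report only one of the two.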
{"title":"Performance analysis of novel hybrid deep learning model IEU Net++ for multiclass categorization of breast mammogram images","authors":"R. Niranjana , A. Ravi , J. Sivadasan","doi":"10.1016/j.bspc.2025.107607","DOIUrl":"10.1016/j.bspc.2025.107607","url":null,"abstract":"<div><div>Breast cancer is statistically proven to be one of the most serious health complications worldwide. Mammography is a popular primary imaging modality. Early detection of minor abnormalities is essential in saving the patient from the agony of advanced stages of cancer. To consider a solution for this requirement, a novel Hybrid Deep Learning IEUNet++ is proposed in this paper for Multiclass Classification of Breast Mammogram images. The method uses an ensemble of top-notch Deep Learning (DL) networks, InceptionResnetV2 and EfficientNetB7 algorithms, as its encoder structure for automatic analysis of mammogram and segmenting the tumor distinctively and categorizing masses and calcifications as normal, innoxious (benign) or noxious (malignant) that can enhance the Computer Aided Diagnostic (CAD) system, profoundly patronaging the medical expert’s analysis. The proposed method is performed functional on three common public datasets: CBIS-DDSM, INBreast and MIAS datasets. The objective of this paper is “to propose an automatic scheme that can efficiently classify breast lesions of different dimensions and patterns of occurrence (masses and calcifications) with low levels of false positives and negatives regardless of the heterogeneity of the dataset involved”. 
The classification results attains 99.56 %, 99.72 %, 99.81 % Sensitivity, 0.907, 0.925, 0.955 Dice scores, 0.953, 0.911 and 0.956 Intersection over Union scores and 0.996, 0.997, 0.998 Specificity for CBIS-DDSM, INBreast and MIAS datasets respectively in first case and 99.87 % accuracy, 99.77 % sensitivity, 0.972 Dice, 0.941 IoU and 0.998 specificity scores in second case of classification.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"105 ","pages":"Article 107607"},"PeriodicalIF":4.9,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143147504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-03 | DOI: 10.1016/j.bspc.2025.107564
Xiaoyu Qiao, Weisheng Li, Bin Xiao, Yuping Huang, Lijian Yang
The score matching with Langevin dynamics (SMLD) method has been successfully applied to accelerated MRI. However, the sampling process requires subtle hand-tuning, as inaccurate hyperparameters can lead to severe hallucination artifacts, particularly with out-of-distribution test data. To address these limitations, a novel workflow is proposed in this study in which naive SMLD samples serve as additional priors to guide model-driven network training. First, a pretrained score network is adopted to generate samples as preliminary guidance images (PGIs), obviating the need for network retraining, parameter tuning, and in-distribution test data. Although PGIs are corrupted by hallucination artifacts, they can provide additional information through effective denoising to facilitate reconstruction. Therefore, a denoising module (DM) is designed in the second step to coarsely eliminate the artifacts and noise from the PGIs. A score-based information extractor (SIE) and a cross-domain information extractor (CIE) are then introduced to capture prior information and build robust mappings to denoised reconstructions. Third, a model-driven network, guided by the denoised PGIs (DGIs), is designed to further recover fine details. DGIs are densely connected with intermediate reconstructions at each cascade, enriching the input information and providing more accurate guidance. Experiments on different datasets indicate that, despite the low average quality of the PGIs, the proposed workflow effectively extracts valuable information to guide network training, even with severely reduced training data and sampling steps. The proposed method outperformed other cutting-edge techniques by effectively mitigating hallucination artifacts, yielding robust and high-quality reconstruction results.
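For context, SMLD sampling iterates an unadjusted Langevin update, x ← x + (ε/2)·score(x) + √ε·z with z ~ N(0, I). The toy sketch below substitutes the analytic score of a standard Gaussian (∇ log p(x) = −x) for a trained score network, so the sampler drifts from a bad initialization toward the target distribution.

```python
import math
import random

def langevin_step(x, score, step_size, rng):
    """One unadjusted Langevin dynamics update:
        x <- x + (step/2) * score(x) + sqrt(step) * z,  z ~ N(0, 1).
    `score` stands in for the pretrained score network."""
    z = rng.gauss(0.0, 1.0)
    return x + 0.5 * step_size * score(x) + math.sqrt(step_size) * z

rng = random.Random(0)
x = 5.0  # start far from the target mode
for _ in range(2000):
    # Analytic score of N(0, 1) used as a stand-in: grad log p(x) = -x.
    x = langevin_step(x, lambda v: -v, 0.01, rng)
```

The hand-tuning fragility noted in the abstract lives in `step_size` and the step count: too large a step amplifies noise into artifacts, too small a step fails to leave the initialization.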
{"title":"Score-based generative priors-guided model-driven Network for MRI reconstruction","authors":"Xiaoyu Qiao , Weisheng Li , Bin Xiao , Yuping Huang , Lijian Yang","doi":"10.1016/j.bspc.2025.107564","DOIUrl":"10.1016/j.bspc.2025.107564","url":null,"abstract":"<div><div>Score matching with Langevin dynamics (SMLD) method has been successfully applied to accelerated MRI. However, the sampling process requires subtle hand-tuning, as inaccurate hyperparameters can lead to severe hallucination artifacts, particularly with out-of-distribution test data. To address these limitations, a novel workflow is proposed in this study in which naive SMLD samples serve as additional priors to guide model-driven network training. First, a pretrained score network was adopted to generate samples as preliminary guidance images (PGI), obviating the need for network retraining, parameter tuning and in-distribution test data. Although PGIs are corrupted by hallucination artifacts, they can provide additional information through effective denoising to facilitate reconstruction. Therefore, a denoising module (DM) was designed in the second step to coarsely eliminate the artifacts and noises from PGIs. A score-based information extractor (SIE) and cross-domain information extractor (CIE) were then introduced to capture prior information and build robust mappings to denoised reconstructions. Third, a model-driven network, guided by denoised PGIs (DGIs), was designed to further recover fine details. DGIs are densely connected with intermediate reconstructions at each cascade, enriching input information and providing more accurate guidance. Experiments on different datasets indicate that, despite the low average quality of PGIs, the proposed workflow effectively extracts valuable information to guide network training, even with severely reduced training data and sampling steps. 
The proposed method outperformed other cutting-edge techniques by effectively mitigating hallucination artifacts, yielding robust and high-quality reconstruction results.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"105 ","pages":"Article 107564"},"PeriodicalIF":4.9,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143147462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-03 | DOI: 10.1016/j.bspc.2025.107585
Sungjun Lim, Taero Kim, Hyeonjeong Lee, Yewon Kim, Minhoi Park, Kwang-Yong Kim, Minseong Kim, Kyu Hyung Kim, Jiyoung Jung, Kyungwoo Song
Machine learning-based estimation of blood pressure (BP) using photoplethysmography (PPG) signals has gained significant attention for its non-invasive nature and potential for continuous monitoring. However, challenges remain in real-world applications, where performance can vary widely across different BP groups, especially among high-risk groups. This study is the first to propose a PPG-based BP estimation approach that specifically accounts for BP group disparities, aiming to improve robustness for high-risk BP groups. We present a comprehensive approach spanning data, model, and loss to enhance overall accuracy and reduce performance degradation for specific groups, referred to as “worst groups.” At the data level, we introduce in-group augmentation using Time-Cutmix to mitigate the severity of group imbalance. From a model perspective, we adopt a hybrid structure of convolutional and Transformer layers to integrate local and global information, improving average model performance. Additionally, we propose robust optimization techniques that consider data quantity and label distributions within each group. These methods effectively minimize performance loss for high-risk groups without compromising average and worst-group performance. Experimental results demonstrate the effectiveness of our methods in developing a robust BP estimation model tailored to handle group-based performance disparities.
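A hedged sketch of what an in-group Time-Cutmix step might look like for 1-D signals: splice a random contiguous segment from one same-group signal into another. The segment-length bounds and the no-label-mixing simplification (valid only because both signals come from the same BP group) are assumptions, not the paper's exact recipe.

```python
import random

def time_cutmix(sig_a, sig_b, rng, min_frac=0.1, max_frac=0.5):
    """Replace a random contiguous time segment of sig_a with the
    corresponding segment of sig_b: a 1-D, time-domain analogue of
    CutMix. With sig_a and sig_b drawn from the same group, the
    label of sig_a carries over unchanged."""
    n = len(sig_a)
    seg = rng.randint(int(min_frac * n), int(max_frac * n))
    start = rng.randint(0, n - seg)
    mixed = list(sig_a)
    mixed[start:start + seg] = sig_b[start:start + seg]
    return mixed

rng = random.Random(7)
a = [0.0] * 100  # stand-in PPG window from a minority BP group
b = [1.0] * 100  # another window from the same group
m = time_cutmix(a, b, rng)
```

Generating such hybrids only within a group inflates the effective sample count of underrepresented (high-risk) groups without blurring group boundaries.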
{"title":"Robust optimization for PPG-based blood pressure estimation","authors":"Sungjun Lim , Taero Kim , Hyeonjeong Lee , Yewon Kim , Minhoi Park , Kwang-Yong Kim , Minseong Kim , Kyu Hyung Kim , Jiyoung Jung , Kyungwoo Song","doi":"10.1016/j.bspc.2025.107585","DOIUrl":"10.1016/j.bspc.2025.107585","url":null,"abstract":"<div><div>Machine learning-based estimation of blood pressure (BP) using photoplethysmography (PPG) signals has gained significant attention for its non-invasive nature and potential for continuous monitoring. However, challenges remain in real-world applications, where performance can vary widely across different BP groups, especially among high-risk groups. This study is the first to propose a PPG-based BP estimation approach that specifically accounts for BP group disparities, aiming to improve robustness for high-risk BP groups.We present a comprehensive approach from the perspectives of data, model, and loss to enhance overall accuracy and reduce performance degradation for specific groups, referred to as “worst groups.” At the data level, we introduce in-group augmentation using Time-Cutmix to mitigate group imbalance severity. From a model perspective, we adopt a hybrid structure of convolutional and Transformer layers to integrate local and global information, improving average model performance. Additionally, we propose robust optimization techniques that consider data quantity and label distributions within each group. These methods effectively minimize performance loss for high-risk groups without compromising average and worst-group performance. 
Experimental results demonstrate the effectiveness of our methods in developing a robust BP estimation model tailored to handle group-based performance disparities.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"105 ","pages":"Article 107585"},"PeriodicalIF":4.9,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143147465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-03 | DOI: 10.1016/j.bspc.2025.107514
Jie Lin, Dongdong Wu, Lipai Huang
Early prediction can assist in diagnosis and slow the progression of brain diseases. As these diseases progress, patients experience cerebral atrophy, and existing brain disease prediction methods based on structural MRI rely on manually extracted morphological change features. Due to the frequent occurrence of missing data in longitudinal MRI sequences and the scarcity of densely annotated atrophy information in existing longitudinal MRI datasets, supervised learning for brain atrophy is challenging. This paper proposes an automated method for learning morphological changes in MRI over the course of a disease, named BM-GAN. It employs a self-supervised approach that jointly learns the brain’s non-rigid deformation over time during the interpolation process and guides the interpolation generator through a bidirectional mapping module to produce missing MRIs consistent with disease progression. BM-GAN generates complete MRI sequences for the ADNI and OASIS datasets, and experimental results show competitive performance on image quality metrics. Moreover, existing disease classification methods based on SVM/CNN/3DCNN see precision improvements of 6.21% to 16% for AD/NC classification and 7.34% to 21.25% for AD/MCI/NC classification when using synthetic data generated by BM-GAN. Visual results indicate that BM-GAN can generate MRIs consistent with the brain atrophy trend of Alzheimer’s disease, thereby facilitating the prediction of brain diseases.
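To see what BM-GAN's learned, disease-consistent deformations improve on, the naive baseline for arbitrary-time interpolation simply blends voxel intensities linearly between two time points:

```python
def interpolate_scan(scan_t0, scan_t1, t):
    """Linear intensity interpolation between two scans at
    normalized times 0 and 1. This blends intensities rather than
    deforming anatomy, which is exactly the limitation a learned
    non-rigid deformation model addresses."""
    return [(1 - t) * a + t * b for a, b in zip(scan_t0, scan_t1)]

# Midpoint between two toy 3-voxel "scans":
mid = interpolate_scan([0.0, 1.0, 2.0], [2.0, 1.0, 0.0], 0.5)
```

Intensity blending produces ghosted, anatomically implausible midpoints when structures move or shrink, which is why atrophy-aware interpolation needs a deformation model rather than this baseline.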
{"title":"Self-supervised bi-directional mapping generative adversarial network for arbitrary-time longitudinal interpolation of missing data","authors":"Jie Lin , Dongdong Wu , Lipai Huang","doi":"10.1016/j.bspc.2025.107514","DOIUrl":"10.1016/j.bspc.2025.107514","url":null,"abstract":"<div><div>Early prediction can assist in diagnosis and slow the progression of brain diseases. As the disease progresses, patients with brain diseases experience cerebral atrophy, and existing brain disease prediction methods based on structural MRI utilize manually extracted morphological change features. Due to the frequent occurrence of missing data in longitudinal MRI sequences and the scarcity of densely annotated atrophy information in existing longitudinal MRI datasets, supervised learning for brain atrophy is challenging. This paper proposes an automated method for learning morphological changes in MRI over the course of a disease, named BM-GAN. It employs a self-supervised approach that jointly learns the brain’s non-rigid deformation over time during the interpolation process and guides the interpolation generator through a bidirectional mapping module to produce missing MRIs consistent with disease progression. BM-GAN generates complete MRI sequences for the ADNI and OASIS dataset, and experimental results show competitive performance on image quality metrics. Moreover, existing disease classification methods based on SVM/CNN/3DCNN have seen an improvement in precision by 6.21% to 16% for AD/NC classification and 7.34% to 21.25% for AD/MCI/NC classification after using synthetic data generated by BM-GAN. 
Visual results indicate that BM-GAN can generate MRIs consistent with the brain atrophy trend of Alzheimer’s disease, thereby facilitating the prediction of brain diseases.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"105 ","pages":"Article 107514"},"PeriodicalIF":4.9,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143146737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-03 | DOI: 10.1016/j.bspc.2025.107603
Ruoyu Meng, Chunxiao Chen, Ming Lu, Xue Fu, Yueyue Xiao, Kunpeng Wang, Yuan Zou, Yang Li
A robust and efficient two-dimensional/three-dimensional (2D/3D) registration algorithm is critical to image-guided interventions, as it allows for intuitive, reproducible, and high-accuracy robot-assisted surgical procedures at competitive costs. Our approach adopts a multi-stage, self-supervised framework tailored to patient-specific contexts to tackle the 2D/3D registration problem. Preoperatively, a regression neural network is trained using synthesized X-rays to achieve robust initialization of rigid-body poses from the Special Euclidean group SE(3). However, SE(3) lacks a bi-invariant metric for measuring the distance between poses, as it is not a direct product of compact and abelian groups. At the same time, existing left-invariant metrics fail to sufficiently account for the consistency and symmetry of spatial displacements under Lie group operations, which may hinder the network from correctly comprehending the 2D/3D projective geometry. To address these limitations, we propose a practical pose parameterization approach that embeds naïve SE(3) pose elements into the four-dimensional Special Orthogonal group SO(4), thereby deriving an approximate bi-invariant metric for network training. Additionally, we present a cross-stage partial style, lightweight network, CSP-ConvNeXt, toward low-cost systematic solutions. Intraoperatively, we perform gradient-based optimization for real-time pose refinement. We report mean target registration error, network registration success rate, and sub-millimeter registration success rate for stage-dependent evaluations. Experimental results demonstrate that our method achieves state-of-the-art registration performance on two public datasets and one in-house dataset across all registration stages.
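To make the metric issue concrete: pure rotations do admit a bi-invariant metric, the geodesic angle between unit quaternions, and it is this property that fails to extend to full SE(3) poses (rotation plus translation) and motivates the SO(4) embedding. A minimal computation of that rotation-only metric:

```python
import math

def quat_geodesic(q1, q2):
    """Bi-invariant geodesic distance between two unit quaternions
    (rotations): |<q1, q2>| equals cos(theta/2) of the relative
    rotation, so the angle is 2*acos of the absolute dot product.
    The abs handles the double cover (q and -q are the same
    rotation)."""
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    dot = min(1.0, dot)  # guard against numerical drift past 1
    return 2.0 * math.acos(dot)

# Identity vs. a 90-degree rotation about the z axis:
qa = (1.0, 0.0, 0.0, 0.0)
qb = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
theta = quat_geodesic(qa, qb)
```

This distance is unchanged if both rotations are pre- or post-multiplied by the same rotation (bi-invariance); no analogous metric exists once translations are mixed in, which is the paper's stated motivation.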
{"title":"Parametric Bi-invariant Learning for Improved Precision in 2D/3D Image Registration","authors":"Ruoyu Meng , Chunxiao Chen , Ming Lu , Xue Fu , Yueyue Xiao , Kunpeng Wang , Yuan Zou , Yang Li","doi":"10.1016/j.bspc.2025.107603","DOIUrl":"10.1016/j.bspc.2025.107603","url":null,"abstract":"<div><div>A robust and efficient two-dimensional/three-dimensional (2D/3D) registration algorithm is critical to image-guided interventions, as it allows for intuitive, reproducible, and high-accuracy robot-assisted surgical procedures at competitive costs. Our approach adopts a multi-stage, self-supervised framework tailored to patient-specific contexts to tackle the 2D/3D registration problem. Preoperatively, a regression neural network is trained using synthesized X-rays to achieve robust initialization of rigid-body poses from the Special Euclidean group <span><math><mrow><mi>S</mi><mi>E</mi><mo>(</mo><mn>3</mn><mo>)</mo></mrow></math></span>. However, <span><math><mrow><mi>S</mi><mi>E</mi><mo>(</mo><mn>3</mn><mo>)</mo></mrow></math></span> lacks a bi-invariant metric for measuring the distance between poses as it is not a direct product of compact and abelian groups. At the same time, existing left-invariant metrics fail to sufficiently account for the consistency and symmetry of spatial displacements under Lie group operations, which may hinder the network from correctly comprehending the 2D/3D projective geometry. To address these limitations, we propose a practical pose parameterization approach that embeds naïve <span><math><mrow><mi>S</mi><mi>E</mi><mo>(</mo><mn>3</mn><mo>)</mo></mrow></math></span> pose elements into the four-dimensional Special Orthogonal group <span><math><mrow><mi>S</mi><mi>O</mi><mo>(</mo><mn>4</mn><mo>)</mo></mrow></math></span>, thereby deriving an approximate bi-invariant metric for network training. Additionally, we present a <em>cross-stage partial</em> style, lightweight network CSP-ConvNeXt towards low-cost systematic solutions. 
Intraoperatively, we perform gradient-based optimization for real-time pose refinement. We report mean target registration error, network registration success rate, and sub-millimeter registration success rate for stage-dependent evaluations. Experimental results demonstrate that our method achieves state-of-the-art registration performance on two public datasets and one in-house dataset across all registration stages.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"105 ","pages":"Article 107603"},"PeriodicalIF":4.9,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143146738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The automatic generation of chest X-ray diagnostic reports can alleviate the workload of radiologists and reduce the probability of misdiagnosis and missed diagnosis. However, the subtle visual differences between diseases, imbalanced sample distributions, and the use of medical terminology pose significant challenges for the automatic generation of medical reports. To address these challenges, this paper proposes a chest X-ray diagnostic report generation model based on multi-modal granularity feature fusion (MMG). During the encoding stage, the Swin Transformer is used to capture both global coarse-grained and local fine-grained features of medical images, integrating multi-granularity features to retain overall image information while enhancing the model’s ability to capture minute lesion details. Then, weighted label word embedding vectors are generated from the fused features using the multi-head class-specific residual attention mechanism (MH-CSRA), capturing the associative information between medical terminology and visual features. Additionally, the BioBERT pre-trained model is employed to extract the semantic features of the patient’s medical history, providing necessary background information for the model. During the decoding stage, distilGPT2 fuses the visual features from the encoder, the label word embedding vectors, and the semantic features of the medical history to generate diagnostic reports. To balance the model’s attention to imbalanced labels, the binary cross-entropy loss (BCE) and the categorical cross-entropy loss (CE) are extended into the polynomial expansion loss functions PBL and PCL, which effectively enhance the accuracy and fluency of the generated reports.
Experimental results show that on the IU-Xray and MIMIC-CXR public datasets, the MMG model achieves superior results compared to existing methods in evaluation metrics such as BLEU, ROUGE-L, and METEOR.
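The abstract does not give the exact PBL/PCL formulas. A published loss in the same spirit is Poly-1, which augments cross-entropy with the leading term of its polynomial expansion in (1 − p_t); the sketch below illustrates that general idea and is not the paper's exact PBL/PCL:

```python
import numpy as np

def poly1_cross_entropy(logits, target, eps=1.0):
    """Cross-entropy plus the first polynomial correction term eps * (1 - p_t).

    With eps=0 this reduces to plain cross-entropy; eps>0 adds extra
    penalty when the true-class probability p_t is low (hard examples).
    """
    z = logits - logits.max()          # numerically stable softmax
    p = np.exp(z) / np.exp(z).sum()
    pt = p[target]                     # probability assigned to the true class
    return -np.log(pt) + eps * (1.0 - pt)

logits = np.array([2.0, 0.5, -1.0])
plain_ce = poly1_cross_entropy(logits, target=0, eps=0.0)
poly1 = poly1_cross_entropy(logits, target=0, eps=1.0)
```

The correction term re-weights the gradient toward under-fit (often minority-class) samples, which matches the abstract's motivation of balancing attention across imbalanced labels.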
{"title":"Automated generation of chest X-ray imaging diagnostic reports by multimodal and multi granularity features fusion","authors":"Junze Fang, Suxia Xing, Kexian Li, Zheng Guo, Ge Li, Chongchong Yu","doi":"10.1016/j.bspc.2025.107562","DOIUrl":"10.1016/j.bspc.2025.107562","url":null,"abstract":"<div><div>The automatic generation of chest X-ray diagnostic reports can alleviate the workload of radiologists and reduce the probability of misdiagnosis and missed diagnosis. However, the subtle visual differences between diseases, imbalanced sample distributions, and the use of medical terminology pose significant challenges for the automatic generation of medical reports. To address these challenges, this paper proposes a chest X-ray diagnostic report generation model based on multi-modal granularity feature fusion (MMG). During the encoding stage, the Swin Transformer is used to capture both global coarse-grained and local fine-grained features of medical images, integrating multi-granularity features to retain overall image information while enhancing the model’s ability to capture minute lesion details. Then, by utilizing the multi-head class-specific residual attention mechanism (MH-CSRA) weighted label word embedding vectors are generated based on the fused features to capture the associative information between medical terminology and visual features. Additionally, the BioBERT pre-trained model is employed to extract the semantic features of the patient’s medical history, providing necessary background information for the model. During the decoding stage, distilGPT2 is used to perform multimodal feature fusion on the visual features generated by the encoder, label word embedding vectors, and semantic features of the medical history, and to generate diagnostic reports. 
To balance the model’s attention to imbalanced labels, improvements have been made to the binary cross-entropy function (BCE) and the categorical cross-entropy loss function(CE), proposing the polynomial expansion loss functions PBL and PCL, which effectively enhance the accuracy and fluency of the generated reports. Experimental results show that on the IU-Xray and MIMIC-CXR public datasets, the MMG model achieves superior results compared to existing methods in evaluation metrics such as BLEU, ROUGE-L, and METEOR.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"105 ","pages":"Article 107562"},"PeriodicalIF":4.9,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143146787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-02-03DOI: 10.1016/j.bspc.2025.107542
Kun Zhang , Qianru Yu , Yansheng Liu , Yumeng Duan , Yingying Lou , Weichao Xu
Objective:
Ulcerative Colitis (UC) is a chronic inflammatory bowel disease whose diagnosis and evaluation rely mainly on colonoscopy. To improve the accuracy of early UC diagnosis, this study proposes a novel deep learning model, Attention-Focused Refinement (AFR), to assist in the accurate classification of UC colonoscopy images.
Methods:
The AFR model combines advanced attention mechanisms and feature refinement techniques to enhance the classification performance of UC colonoscopy images through a self-supervised learning strategy and a multi-module integration method. The model design takes into account the characteristics of UC colonoscopy images, focuses on analyzing the subtle features of the lesion area, and reduces sensitivity to interference factors. The model was trained on a UC colonoscopy image dataset and evaluated against existing models.
Results:
The experimental results demonstrated that the AFR model exhibited high accuracy across the four Mayo score categories (0 = normal, 1 = mild, 2 = moderate, and 3 = severe) of UC colonoscopy images, with accuracies of 0.996, 0.992, 0.972, and 0.994, respectively. These results were validated on independent datasets, showcasing the reliability and effectiveness of the AFR model in various clinical settings. Ablation experiments and comparative analyses with other state-of-the-art models further confirm the applicability and stability of the AFR model in classification tasks.
Conclusion:
The AFR model shows promising results in the UC colonoscopy image classification task, supporting its effectiveness as an auxiliary diagnostic tool. It introduces new possibilities for enhancing the accuracy and efficiency of UC diagnosis.
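The abstract does not specify AFR's internal architecture. As a generic illustration of attention-focused feature refinement, the sketch below implements a squeeze-and-excitation-style channel attention block in NumPy; it is a hypothetical example of the technique family, not the paper's design:

```python
import numpy as np

def channel_attention_refine(feat, w1, w2):
    """Refine a feature map with squeeze-and-excitation-style channel attention.

    feat: (C, H, W) feature map.
    w1: (C//r, C) and w2: (C, C//r) weights of the bottleneck MLP
        (hypothetical learned parameters; r is the reduction ratio).
    """
    c = feat.shape[0]
    squeezed = feat.reshape(c, -1).mean(axis=1)    # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)        # ReLU bottleneck
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid channel gates in (0, 1)
    return feat * gates[:, None, None]             # reweight each channel

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 4, 4))                  # toy 8-channel feature map
w1 = rng.normal(size=(2, 8)) * 0.1                 # reduction ratio r = 4
w2 = rng.normal(size=(8, 2)) * 0.1
refined = channel_attention_refine(feat, w1, w2)
```

Blocks of this kind let a classifier emphasize channels responding to subtle lesion features while down-weighting channels dominated by interference, which matches the behavior the abstract attributes to AFR.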
{"title":"AFR: An image-aided diagnostic approach for ulcerative colitis","authors":"Kun Zhang , Qianru Yu , Yansheng Liu , Yumeng Duan , Yingying Lou , Weichao Xu","doi":"10.1016/j.bspc.2025.107542","DOIUrl":"10.1016/j.bspc.2025.107542","url":null,"abstract":"<div><h3>Objective:</h3><div>Ulcerative Colitis (UC) is a chronic inflammatory bowel disease, and its diagnosis and evaluation mainly rely on colonoscopy. Aiming to improve the accuracy of UC early diagnosis, this study proposes a novel deep learning model, Attention-Focused Refinement (AFR), to assist in the accurate classification of UC colonoscopy images.</div></div><div><h3>Methods:</h3><div>The AFR model combines advanced attention mechanisms and feature refinement technology to enhance the classification performance of UC enteroscopy images through a self-supervised learning strategy and a multi-module integration method. The model design takes into account the characteristics of UC colonoscopy images, focuses on analyzing the subtle features of the lesion area, and reduces sensitivity to interference factors. The model was trained on the UC colonoscopy image dataset and evaluated against existing models.</div></div><div><h3>Results:</h3><div>The experimental results demonstrated that the AFR model exhibited high accuracy in the four Mayo score categories (0 points normal, 1 points mild, 2 points moderate, and 3 points severe) of UC colonoscopy images, with accuracies of 0.996, 0.992, 0.972, and 0.994, respectively. These results were validated on independent datasets, showcasing the reliability and effectiveness of the AFR model in various clinical settings. 
Ablation experiments and comparative analyses with other state-of-the-art models further confirm the applicability and stability of the AFR model in classification tasks.</div></div><div><h3>Conclusion:</h3><div>The AFR model shows promising results in UC colonoscopy image classification task, validating its effectiveness as an auxiliary diagnostic tool. This model introduces new possibilities to enhance the accuracy and efficiency of UC diagnosis.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"105 ","pages":"Article 107542"},"PeriodicalIF":4.9,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143147468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}