The rapid explosion of medical data, exacerbated by the demands of smart healthcare, poses significant challenges for authentication and integrity verification. Moreover, the surge in cybercrime targeting healthcare data jeopardizes patient privacy, compromising both trust and diagnostic reliability. To address these concerns, we propose a robust healthcare system that integrates a kidney stone segmentation framework with a watermarking protocol tailored for Internet of Medical Things (IoMT) applications. Drawing upon patient information and biometrics, chaotic keys are generated for obfuscation and randomization, along with the watermark for integrity verification and authentication. The watermark is imperceptibly embedded into the obfuscated medical image using Singular Value Decomposition (SVD) and adaptive quantization, followed by randomization. Upon reception, successful watermark extraction and verification ensure secure access to unaltered medical data, enabling precise segmentation. To facilitate this, a ResNeXt-50 inspired encoder and attention-guided decoder are introduced within the U-Net architecture to enhance comprehensive feature learning. The effectiveness and practicality of the proposed system have been evaluated through comprehensive experiments on kidney CT scans. Comparative analysis with state-of-the-art techniques highlights its superior performance.
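As a rough illustration of the SVD-plus-quantization embedding step described above, the sketch below quantizes the largest singular value of an image block so that its parity encodes one watermark bit (quantization index modulation). The block size, quantization step, and function names are illustrative assumptions, not the paper's actual protocol, which additionally involves chaotic keys, obfuscation, and randomization.

```python
import numpy as np

def embed_bit(block, bit, step=24.0):
    # Quantization index modulation on the largest singular value:
    # snap it to the centre of a quantization cell whose parity encodes the bit.
    U, s, Vt = np.linalg.svd(block.astype(float), full_matrices=False)
    q = int(np.floor(s[0] / step))
    if q % 2 != bit:
        q += 1
    s[0] = q * step + step / 2.0
    return U @ np.diag(s) @ Vt

def extract_bit(block, step=24.0):
    # Recover the bit from the parity of the quantized largest singular value.
    s = np.linalg.svd(block.astype(float), compute_uv=False)
    return int(np.floor(s[0] / step)) % 2
```

Embedding at the cell centre makes extraction robust to small perturbations of up to half a quantization step.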
{"title":"Watermarking Protocol Inspired Kidney Stone Segmentation in IoMT.","authors":"Parkala Vishnu Bharadwaj Bayari, Nishtha Tomar, Gaurav Bhatnagar, Chiranjoy Chattopadhyay","doi":"10.1109/JBHI.2025.3563955","DOIUrl":"10.1109/JBHI.2025.3563955","url":null,"abstract":"<p><p>The rapid explosion of medical data, exarcebated by the demands of smart healthcare, poses significant challenges for authentication and integrity verification. Moreover, the surge in cybercrime targeting healthcare data jeopardizes patient privacy, compromising both trust and diagnostic reliability. To address these concerns, we propose a robust healthcare system that integrates a kidney stone segmentation framework with a watermarking protocol tailored for Internet of Medical Things (IoMT) applications. Drawing upon patient information and biometrics, chaotic keys are generated for obfuscation and randomization, along with the watermark for integrity verification and authentication. The watermark is imperceptibly embedded into the obfuscated medical image using Singular Value Decomposition (SVD) and adaptive quantization, followed by randomization. Upon reception, successful watermark extraction and verification ensure secure access to unaltered medical data, enabling precise segmentation. To facilitate this, a ResNeXt-50 inspired encoder and attention-guided decoder are introduced within the U-Net architecture to enhance comprehensive feature learning. The effectiveness and practicality of the proposed system have been evaluated through comprehensive experiments on kidney CT scans. 
Comparative analysis with state-of-the-art techniques highlights its superior performance.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":"828-838"},"PeriodicalIF":6.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143995632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01. DOI: 10.1109/JBHI.2025.3562364
Claudia V Brito, Pedro G Ferreira, Joao T Paulo
Breakthroughs in sequencing technologies have led to an exponential growth of genomic data, providing novel biological insights and therapeutic applications. However, analyzing large amounts of sensitive data raises key data privacy concerns, particularly when the information is outsourced to untrusted third-party infrastructures for data storage and processing (e.g., cloud computing). We introduce Gyosa, a secure and privacy-preserving distributed genomic analysis solution. By leveraging trusted execution environments (TEEs), Gyosa allows users to confidentially delegate their GWAS analysis to untrusted infrastructures. Gyosa implements a computation partitioning scheme that reduces the computation done inside the TEEs while safeguarding the users' genomic data privacy. By integrating this security scheme in Glow, Gyosa provides a secure and distributed environment that facilitates diverse GWAS studies. The experimental evaluation validates the applicability and scalability of Gyosa, reinforcing its ability to provide enhanced security guarantees.
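The statistical kernel that such a GWAS pipeline computes per variant, sketched here without any TEE or partitioning machinery, is typically an association test such as a Pearson chi-square on a 2×2 allele-count table. The function below is a generic textbook illustration, not Gyosa's implementation.

```python
import numpy as np

def chi2_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square statistic for a 2x2 allele-count table
    (cases vs. controls, alternate vs. reference allele)."""
    obs = np.array([[case_alt, case_ref],
                    [ctrl_alt, ctrl_ref]], dtype=float)
    row = obs.sum(axis=1, keepdims=True)   # per-group totals
    col = obs.sum(axis=0, keepdims=True)   # per-allele totals
    exp = row * col / obs.sum()            # expected counts under independence
    return float(((obs - exp) ** 2 / exp).sum())
```

In a partitioned design, only this small aggregation step would need to run inside the enclave, while count collection can be distributed outside it.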
{"title":"Exploiting Trusted Execution Environments and Distributed Computation for Genomic Association Tests.","authors":"Claudia V Brito, Pedro G Ferreira, Joao T Paulo","doi":"10.1109/JBHI.2025.3562364","DOIUrl":"10.1109/JBHI.2025.3562364","url":null,"abstract":"<p><p>Breakthroughs in sequencing technologies led to an exponential growth of genomic data, providing novel biological insights and therapeutic applications. However, analyzing large amounts of sensitive data raises key data privacy concerns, specifically when the information is outsourced to untrusted third-party infrastructures for data storage and processing (e.g., cloud computing). We introduce Gyosa, a secure and privacy-preserving distributed genomic analysis solution. By leveraging trusted execution environments (TEEs), Gyosa allows users to confidentially delegate their GWAS analysis to untrusted infrastructures. Gyosa implements a computation partitioning scheme that reduces the computation done inside the TEEs while safeguarding the users' genomic data privacy. By integrating this security scheme in Glow, Gyosa provides a secure and distributed environment that facilitates diverse GWAS studies. The experimental evaluation validates the applicability and scalability of Gyosa, reinforcing its ability to provide enhanced security guarantees.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":"913-920"},"PeriodicalIF":6.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144003141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01. DOI: 10.1109/JBHI.2025.3552455
Fangfang Zhu, Honghong Su, Ji Ding, Qichao Niu, Qi Zhao, Jianwei Shuai
The advancement of remote photoplethysmography (rPPG) technology depends on the availability of comprehensive datasets. However, the reliance on facial features for rPPG signal acquisition poses significant privacy concerns, hindering the development of open-access datasets. This work establishes privacy protection principles for rPPG datasets and introduces the secure anonymization and encryption framework (SAEF) to address these challenges while preserving rPPG data integrity. SAEF first identifies privacy-sensitive facial regions for removal through importance and necessity analysis. The irreversible removal of these regions has an insignificant impact on signal quality, with an R-value deviation of less than 0.06 for BVP extraction and a mean absolute error (MAE) deviation of less than 0.05 for heart rate (HR) calculation. Additionally, SAEF introduces a high-efficiency cascade key encryption method (CKEM), achieving encryption in 5.54 × 10⁻⁵ seconds per frame, which is over three orders of magnitude faster than other methods, and reducing approximate point correlation (APC) values to below 0.005, approaching complete randomness. These advancements significantly improve real-time video encryption performance and security. Finally, SAEF serves as a preprocessing tool for generating volunteer-friendly, open-access rPPG datasets.
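As a toy illustration of per-frame keystream encryption (not the paper's CKEM, whose construction is not detailed in the abstract), the sketch below XORs each frame with bytes derived from a logistic chaotic map; since XOR is an involution, the same call decrypts. A logistic-map keystream is not cryptographically secure and is used here purely for illustration.

```python
import numpy as np

def logistic_keystream(n, x0=0.654321, r=3.99):
    """n pseudo-random bytes from a logistic map (illustrative, NOT secure)."""
    ks = np.empty(n, dtype=np.uint8)
    x = x0
    for i in range(n):
        x = r * x * (1.0 - x)          # chaotic iteration in (0, 1)
        ks[i] = int(x * 256) % 256
    return ks

def encrypt_frame(frame, key_seed):
    """XOR a uint8 video frame with a keystream; calling twice decrypts."""
    flat = frame.reshape(-1)
    ks = logistic_keystream(flat.size, x0=key_seed)
    return (flat ^ ks).reshape(frame.shape)
```

A real scheme would derive `key_seed` per frame from a proper key schedule; here it is a bare float for brevity.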
{"title":"SAEF: Secure Anonymization and Encryption Framework for Open-Access Remote Photoplethysmography Datasets.","authors":"Fangfang Zhu, Honghong Su, Ji Ding, Qichao Niu, Qi Zhao, Jianwei Shuai","doi":"10.1109/JBHI.2025.3552455","DOIUrl":"10.1109/JBHI.2025.3552455","url":null,"abstract":"<p><p>The advancement of remote photoplethys-mography (rPPG) technology depends on the availability of comprehensive datasets. However, the reliance on facial features for rPPG signal acquisition poses significant privacy concerns, hindering the development of open-access datasets. This work establishes privacy protection principles for rPPG datasets and introduces the secure anonymization and encryption framework (SAEF) to address these challenges while preserving rPPG data integrity. SAEF first identifies privacy-sensitive facial regions for removal through importance and necessity analysis. The irreversible removal of these regions has an insignificant impact on signal quality, with an R-value deviation of less than 0.06 for BVP extraction and a mean absolute error (MAE) deviation of less than 0.05 for heart rate (HR) calculation. Additionally, SAEF introduces a high efficiency cascade key encryption method (CKEM), achieving encryption in 5.54 × 10<sup>-5</sup> seconds per frame, which is over three orders of magnitude faster than other methods, and reducing approximate point correlation (APC) values to below 0.005, approaching complete randomness. These advancements significantly improve real-time video encryption performance and security. 
Finally, SAEF serves as a preprocessing tool for generating volunteer-friendly, open-access rPPG datasets.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":"879-889"},"PeriodicalIF":6.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143657054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01. DOI: 10.1109/JBHI.2025.3588108
Yiqian Luo, Qiurong Chen, Fali Li, Peng Xu, Yangsong Zhang
Increasing the volume of training data can enable the auxiliary diagnostic algorithms for Autism Spectrum Disorder (ASD) to learn more accurate and stable models. However, due to the significant heterogeneity and domain shift in rs-fMRI data across different sites, the accuracy of auxiliary diagnosis remains unsatisfactory. Moreover, there has been limited exploration of multi-source domain adaptation models for ASD recognition, and many existing models lack inherent interpretability, as they do not explicitly incorporate prior neurobiological knowledge such as the hierarchical structure of functional brain networks. To address these challenges, we propose a domain-adaptive algorithm based on hyperbolic space embedding. Hyperbolic space is naturally suited for representing the topology of complex networks such as brain functional networks. Therefore, we embedded the brain functional network into hyperbolic space and constructed the corresponding hyperbolic space community network to effectively extract latent representations. To address the heterogeneity of data across different sites and the issue of domain shift, we introduce a constraint loss function, Hyperbolic Maximum Mean Discrepancy (HMMD), to align the marginal distributions in the hyperbolic space. Additionally, we employ class prototype alignment to mitigate discrepancies in conditional distributions across domains. Experimental results indicate that the proposed algorithm achieves superior classification performance for ASD compared to baseline models, with improved robustness to multi-site heterogeneity. Specifically, our method achieves an average accuracy improvement of 4.03%. Moreover, its generalization capability is further validated through experiments conducted on additional Major Depressive Disorder (MDD) datasets.
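The Poincaré-ball geodesic distance underlying such hyperbolic embeddings, together with a biased MMD² estimate built on a Gaussian-of-geodesic-distance kernel, can be sketched as follows. The kernel choice and bandwidth are illustrative assumptions, not the paper's exact HMMD formulation.

```python
import numpy as np

def poincare_dist(u, v, eps=1e-9):
    """Geodesic distance between two points inside the unit Poincare ball."""
    duv = np.dot(u - v, u - v)
    denom = (1.0 - np.dot(u, u)) * (1.0 - np.dot(v, v)) + eps
    return np.arccosh(1.0 + 2.0 * duv / denom)

def hyperbolic_mmd2(X, Y, gamma=1.0):
    """Biased MMD^2 estimate between two sample sets in the Poincare ball,
    using k(x, y) = exp(-gamma * d(x, y)^2) as an illustrative kernel."""
    def K(A, B):
        return np.mean([[np.exp(-gamma * poincare_dist(a, b) ** 2)
                         for b in B] for a in A])
    return K(X, X) + K(Y, Y) - 2.0 * K(X, Y)
```

Minimizing such a discrepancy between source-site and target-site embeddings is the alignment idea; the paper applies it as a training loss rather than a standalone statistic.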
{"title":"Multi-Site rs-fMRI Domain Alignment for Autism Spectrum Disorder Auxiliary Diagnosis Based on Hyperbolic Space.","authors":"Yiqian Luo, Qiurong Chen, Fali Li, Peng Xu, Yangsong Zhang","doi":"10.1109/JBHI.2025.3588108","DOIUrl":"10.1109/JBHI.2025.3588108","url":null,"abstract":"<p><p>Increasing the volume of training data can enable the auxiliary diagnostic algorithms for Autism Spectrum Disorder (ASD) to learn more accurate and stable models. However, due to the significant heterogeneity and domain shift in rs-fMRI data across different sites, the accuracy of auxiliary diagnosis remains unsatisfactory. Moreover, there has been limited exploration of multi-source domain adaptation models on ASD recognition, and many existing models lack inherent interpretability, as they do not explicitly incorporate prior neurobiological knowledge such as the hierarchical structure of functional brain networks. To address these challenges, we proposed a domain-adaptive algorithm based on hyperbolic space embedding. Hyperbolic space is naturally suited for representing the topology of complex networks such as brain functional networks. Therefore, we embedded the brain functional network into hyperbolic space and constructed the corresponding hyperbolic space community network to effectively extract latent representations. To address the heterogeneity of data across different sites and the issue of domain shift, we introduce a constraint loss function, Hyperbolic Maximum Mean Discrepancy (HMMD), to align the marginal distributions in the hyperbolic space. Additionally, we employ class prototype alignment to mitigate discrepancies in conditional distributions across domains. Experimental results indicate that the proposed algorithm achieves superior classification performance for ASD compared to baseline models, with improved robustness to multi-site heterogeneity. Specifically, our method achieves an average accuracy improvement of 4.03% . 
Moreover, its generalization capability is further validated through experiments conducted on extra Major Depressive Disorder (MDD) datasets.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":"1230-1243"},"PeriodicalIF":6.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144636934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01. DOI: 10.1109/JBHI.2025.3583044
B AubouinPairault, M Reus, B Meyer, R Wolf, M Fiacchini, T Dang
In this paper, the problem of triggering early warning for intra-operative hypotension (IOH) is addressed. Recent studies on the Hypotension Prediction Index have demonstrated a gap between the results presented during model development and clinical evaluation. Thus, there is a need for better collaboration between data scientists and clinicians, who need to agree on a common basis to evaluate those models. In this paper, we propose a comprehensive framework for IOH prediction: to address several issues inherent to the commonly used fixed-time-to-onset approach in the literature, a sliding window approach is suggested. The risk prediction problem is formalized with consistent precision-recall metrics rather than the receiver operating characteristic. For illustration, a standard machine learning method is applied using two different datasets from non-cardiac and cardiac surgery. Training is done on a part of the non-cardiac surgery dataset and tests are performed separately on the rest of the non-cardiac dataset and the cardiac dataset. Compared to a realistic clinical baseline, the proposed method achieves a significant improvement on the non-cardiac surgeries (precision of 48% compared to 32% for a recall of 28% (p < 0.001)). For cardiac surgery, this improvement is less significant but still demonstrates the generalization of the model.
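A minimal sketch of the sliding-window evaluation idea: an alarm counts as a true positive if an IOH onset follows it within the window, and an event is caught if some alarm precedes it within the same window. The exact window semantics and names below are assumptions, not the paper's protocol.

```python
def event_precision_recall(alarms, events, horizon):
    """Precision/recall for event prediction with a look-ahead window.
    `alarms` and `events` are sample indices; an alarm is a true positive
    if an event starts within `horizon` samples after it."""
    tp_alarms = sum(any(0 <= e - a <= horizon for e in events) for a in alarms)
    caught = sum(any(0 <= e - a <= horizon for a in alarms) for e in events)
    precision = tp_alarms / len(alarms) if alarms else 0.0
    recall = caught / len(events) if events else 0.0
    return precision, recall
```

Unlike a fixed-time-to-onset labeling, this scoring credits an alarm anywhere inside the window, which matches how a clinician would act on an early warning.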
{"title":"A Comprehensive Framework for the Prediction of Intra-Operative Hypotension.","authors":"B AubouinPairault, M Reus, B Meyer, R Wolf, M Fiacchini, T Dang","doi":"10.1109/JBHI.2025.3583044","DOIUrl":"10.1109/JBHI.2025.3583044","url":null,"abstract":"<p><p>In this paper, the problem of triggering early warning for intra-operative hypotension (IOH) is addressed. Recent studies on the Hypotension Prediction Index have demonstrated a gap between the results presented during model development and clinical evaluation. Thus, there is a need for better collaboration between data scientists and clinicians who need to agree on a common basis to evaluate those models. In this paper, we propose a comprehensive framework for IOH prediction: to address several issues inherent to the commonly used fixed-time-to-onset approach in the literature, a sliding window approach is suggested. The risk prediction problem is formalized with consistent precision-recall metrics rather than the receiver-operator characteristic. For illustration, a standard machine learning method is applied using two different datasets from non-cardiac and cardiac surgery. Training is done on a part of the non-cardiac surgery dataset and tests are performed separately on the rest of the non-cardiac dataset and cardiac dataset. Compared to a realistic clinical baseline, the proposed method achieves a significant improvement on the non-cardiac surgeries (precision of 48% compared to 32% for a recall of 28% (p$< $0.001)). 
For cardiac surgery, this improvement is less significant but still demonstrate the generalization of the model.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":"1618-1629"},"PeriodicalIF":6.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144484094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Semi-supervised medical image segmentation is essential for alleviating the cost of manual annotation in clinical applications. However, existing methods often suffer from unreliable pseudo-labels and confirmation bias in consistency-based training, which can lead to unstable optimization and degraded performance. To address these issues, a novel dual-student adversarial framework with discriminator and consistency-driven learning for semi-supervised medical image segmentation is proposed. Specifically, an adversarial learning-based segmentation refinement (ALSR) module is designed to encourage prediction diversity between two student networks and leverage a shared discriminator for adversarial refinement of pseudo-labels. To further stabilize the consistency process, a residual exponential moving average (R-EMA) is applied in the uncertainty estimation with inter-instance consistency measurement (UIM) module to construct a robust teacher model, while noisy voxel predictions are selectively filtered based on uncertainty estimation. In addition, a Contrastive Representation Stabilization (CRS) module is developed to enhance voxel-level semantic alignment by performing contrastive learning only on confident regions, improving feature discriminability and structural consistency.
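The teacher construction can be illustrated with a plain EMA weight update plus entropy-based filtering of uncertain voxels. The paper's R-EMA adds a residual term and its UIM module adds inter-instance consistency measurement, neither of which is reproduced in this sketch; the threshold value is an arbitrary illustrative choice.

```python
import numpy as np

def ema_update(teacher_w, student_w, alpha=0.99):
    """Plain EMA teacher update over weight tensors; the paper's R-EMA
    variant modifies this with a residual term (not shown)."""
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_w, student_w)]

def confidence_mask(probs, thresh=0.5):
    """Keep voxels whose binary predictive entropy is below a threshold,
    so noisy teacher predictions are excluded from the consistency loss."""
    ent = -(probs * np.log(probs + 1e-12) +
            (1.0 - probs) * np.log(1.0 - probs + 1e-12))
    return ent < thresh
```

In training, the consistency loss would be computed only where `confidence_mask` is true, which is the uncertainty-filtering idea described above.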
{"title":"Dual-Student Adversarial Framework With Discriminator and Consistency-Driven Learning for Semi-Supervised Medical Image Segmentation.","authors":"Haifan Wu, Yuhan Geng, Di Gai, Jieying Tu, Xin Xiong, Qi Wang, Zheng Huang","doi":"10.1109/JBHI.2025.3597469","DOIUrl":"10.1109/JBHI.2025.3597469","url":null,"abstract":"<p><p>Semi-supervised medical image segmentation is essential for alleviating the cost of manual annotation in clinical applications. However, existing methods often suffer from unreliable pseudo-labels and confirmation bias in consistency-based training, which can lead to unstable optimization and degraded performance. To address these issues, a novel method named dual-Student adversarial framework with discriminator and consistency-driven learning for semi-supervised medical image segmentation is proposed. Specifically, an adversarial learning-based segmentation refinement (ALSR) module is designed to encourage prediction diversity between two student networks and leverage a shared discriminator for adversarial refinement of pseudo-labels. To further stabilize the consistency process, a residual exponential moving average (R-EMA) is applied in the uncertainty estimation with inter-instance consistency measurement (UIM) module to construct a robust teacher model, while noisy voxel predictions are selectively filtered based on uncertainty estimation. In addition, a Contrastive Representation Stabilization (CRS) module is developed to enhance voxel-level semantic alignment by performing contrastive learning only on confident regions, improving feature discriminability and structural consistency. 
Extensive experiments on benchmark datasets demonstrate that our method consistently outperforms prior state-of-the-art approaches.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":"1492-1505"},"PeriodicalIF":6.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144821298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01. DOI: 10.1109/JBHI.2025.3613234
Muhammad Adil, Shahid Mumtaz, Ahmed Farouk, Houbing Song, Zhanpeng Jin
The literature repeatedly reports that the unique nature of individual brainwave patterns makes them suitable for identification and authentication, because they are difficult to replicate or forge. Therefore, many researchers have utilized brainwaves for authentication by training traditional deep learning and machine learning models. However, the internal decision processes of these black-box models have not been evaluated in terms of biases, overfitting, large training data requirements, and handling of complex data structures, which leaves their behavior opaque. To address these limitations, a smart system needs to be developed that makes the authentication process user-friendly, robust, and reliable. In this paper, we present a deep reinforcement learning-based biometric authentication framework known as "BrainAuth" for personal identification using the gamma ($\gamma$) and beta ($\beta$) brainwaves. This approach improves authentication accuracy by (i) using the Dyna framework and a dual estimation technique; both help maintain the integrity of the brainwave patterns needed for authentication and for detecting spoofing activities. (ii) We also introduce a layered architecture in the proposed model that uses two deep neural networks to reduce the time needed for exploration. These networks work together to handle the complex data while making decisions in delay-sensitive environments. (iii) We evaluate the model on seen and unseen data to verify its robustness. During analysis, the model achieved an equal error rate (EER) of $\approx$ 0.07% for seen data and $\approx$ 0.15% for unseen data, respectively. Furthermore, the analysis metrics such as true positive (TP), false positive (FP), true negative (TN), and false negative (FN) followed by false acceptance rate (FAR), false rejection rate (FRR), true acceptance rate (TAR) revealed significant improvements compared to existing schemes.
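The dual estimation idea is closely related to double Q-learning, where one value table selects the greedy action and the other evaluates it, reducing overestimation bias. The sketch below shows one such (deterministic) update step; the full algorithm also swaps the roles of the two tables at random, and the Dyna-style model-based planning used by the paper is omitted.

```python
def double_q_update(Qa, Qb, s, a, r, s2, alpha=0.1, gamma=0.9):
    """One double-Q-learning step on dict-of-dict tables:
    Qa picks the greedy next action, Qb provides its value estimate."""
    best = max(Qa[s2], key=Qa[s2].get)        # selection by Qa
    target = r + gamma * Qb[s2][best]          # evaluation by Qb
    Qa[s][a] += alpha * (target - Qa[s][a])    # in-place TD update
```

Decoupling selection from evaluation is what keeps the value estimates (and hence authentication decisions built on them) from being systematically optimistic.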
{"title":"BrainAuth: A Neuro-Biometric Approach for Personal Authentication.","authors":"Muhammad Adil, Shahid Mumtaz, Ahmed Farouk, Houbing Song, Zhanpeng Jin","doi":"10.1109/JBHI.2025.3613234","DOIUrl":"10.1109/JBHI.2025.3613234","url":null,"abstract":"<p><p>The literature repeatedly reports that the unique nature of individual brainwave patterns makes them suitable for identification and authentication, because they are difficult to replicate or forge. Therefore, many researchers have utilized brainwaves for authentication by training traditional deep learning and machine learning models. However, the internal decision processes of these black-box models have not been evaluated in terms of biases, overfitting, large training data requirements, and handling complex data structures, which keep them in a fuzzy state. To address these limitations, a smart system is needed to be develop that could be capable of making the authentication process user-friendly, robust, and reliable. In this paper, we present a deep reinforcement learning-based biometric authentication framework known as \"BrainAuth\" for personal identification using the gamma ($gamma$) and beta ($beta$) brainwaves. This approach improves the accuracy of authentication by using the (i) Dyna framework and a dual estimation technique. Both these technique helps to maintain the integrity of brainwave patterns, which are needed for authentication and understanding of spoofing activities. (ii) We also introduce a layered structure architecture in the proposed model to reduce the time needed for exploration using two deep neural networks. These networks work together to handle the complex data while making decisions in delay sensitive environment. (iii) We evaluate the model on seen and unseen data to verify its robustness. During analysis, the model achieved an equal error rate (EER) of $approx$ 0.07% for seen data and $approx$ 0.15% for unseen data, respectively. 
Furthermore, the analysis metrics such as true positive (TP), false positive (FP), true negative (TN), and false negative (FN) followed by false acceptance rate (FAR), false rejection rate (FRR), true acceptance rate (TAR) revealed significant improvements compared to existing schemes.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":"900-912"},"PeriodicalIF":6.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145137298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-02-01. DOI: 10.1109/JBHI.2025.3594113
Bo Lu, Tiancheng Zhou, Qingbiao Li, Wenzheng Chi, Yue Wang, Yu Wang, Huicong Liu, Jia Gu, Lining Sun
Robotic endoluminal surgery has gained tremendous attention for its enhanced treatments in gastrointestinal intervention, where guiding surgeons with monocular, camera-based metric depth estimation plays a vital role. However, existing methods either rely on external sensors or perform poorly in terms of visual navigation. In this work, we present our M$^{3}$-Degrees Net, a novel monocular vision-guided and graph learning-based network tailored for accurate metric marching depth (MD) estimation. We first leverage a generative model to output a scale-free depth map, providing a depth basis at coarse granularity. To achieve an optimized and metric MD prediction, a relational graph convolutional network with multi-modal visual knowledge fusion is devised. It utilizes shared salient features between keyframes and encodes their pixel differences on the depth basis as the main node, while a projection length-based node that predicts the MD on a proportional relationship basis is introduced, endowing the network with explicit depth awareness. Moreover, to compensate for rotation-induced MD estimation bias, we model the endoscope's orientation changes as image-level feature shifts, formulating an ego-motion correction node for MD optimization. Lastly, a multi-layer regression network for metric MD estimation at finer granularity is devised. We validate our network on both public and in-house datasets, and the quantitative results reveal that it can limit the overall MD error under 27.3%, which vastly outperforms the existing methods.
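The "proportional relationship basis" for metric depth can be illustrated with the textbook pinhole-camera relation: an object of known physical length L projecting to l pixels under focal length f (in pixels) lies at depth Z = f·L / l. This is a generic relation used for illustration, not the paper's projection length-based node.

```python
def metric_depth_from_projection(f_px, real_len_mm, proj_len_px):
    """Pinhole-camera proportionality: depth Z = f * L / l, where f is the
    focal length in pixels, L the physical length, l the projected length."""
    return f_px * real_len_mm / proj_len_px
```

For example, a 10 mm landmark imaged at 50 px with a 500 px focal length lies 100 mm from the camera; the network refines such coarse proportional cues rather than using them directly.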
{"title":"M$^{3}$-DEGREES Net: Monocular-Guided Metric Marching Depth Estimation With Graph-Based Relevance Ensemble for Endoluminal Surgery.","authors":"Bo Lu, Tiancheng Zhou, Qingbiao Li, Wenzheng Chi, Yue Wang, Yu Wang, Huicong Liu, Jia Gu, Lining Sun","doi":"10.1109/JBHI.2025.3594113","DOIUrl":"10.1109/JBHI.2025.3594113","url":null,"abstract":"<p><p>Robotic endoluminal surgery has gained tremendous attention for its enhanced treatments in gastrointestinal intervention, where navigating surgeons with monocular camera-based metric depth estimation is a vital sector. However, existing methods either rely on external sensors or perform poorly in terms of visual navigation. In this work, we present our M$^{3}$-Degrees Net, a novel monocular vision-guided and graph learning-based network tailored for accurate metric marching depth (MD) estimation. We first leverage a generative model to output a scale-free depth map, providing a depth basis in a coarse granularity. To achieve an optimized and metric MD prediction, a relational graph convolutional network with multi-modal visual knowledge fusion is devised. It utilizes shared salient features between keyframes and encodes their pixel differences on the depth basis as the main node, while a projection length-based node that predicts the MD on a proportional relationship basis is introduced, aiming to enable the network with explicit depth awareness. Moreover, to compensate for rotation-induced MD estimation bias, we model the endoscope's orientation changes as image-level feature shifts, formulating an ego-motion correction node for MD optimization. Lastly, a multi-layer regression network for the metric MD estimation with finer granularity is devised. We validate our network on both public and in-house datasets, and the quantitative results reveal that it can limit the overall MD error under 27.3%, which vastly outperforms the existing methods. 
Besides, our M$^{3}$-Degrees Net is qualitatively tested on the in-house clinical gastrointestinal endoscopy data, demonstrating its satisfactory performance even under cavity mucus with varying reflections, indicating promising clinical potentials.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":"1244-1257"},"PeriodicalIF":6.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144764854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
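The first stage described above rests on a common idea: a generative model yields a depth map that is correct only up to scale, which is later anchored to metric values (here via a projection-length cue and graph learning). A minimal sketch of that scale-recovery step, assuming a hypothetical set of sparse metric anchors standing in for the paper's projection-length node:

```python
import numpy as np

def align_scale_free_depth(depth_rel, sparse_metric):
    """Least-squares scale/shift alignment of a scale-free depth map.

    depth_rel: (H, W) relative depth from a generative model (up to scale/shift).
    sparse_metric: dict mapping (row, col) pixels to known metric depths —
        a hypothetical stand-in for the projection-length cue.
    Returns the metric depth map s * depth_rel + t.
    """
    pts = np.array([depth_rel[r, c] for (r, c) in sparse_metric])
    tgt = np.array(list(sparse_metric.values()))
    # Solve [d 1] @ [s, t]^T = z in the least-squares sense.
    A = np.stack([pts, np.ones_like(pts)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, tgt, rcond=None)
    return s * depth_rel + t
```

This is only the linear-alignment baseline; the paper's contribution is learning the metric correction jointly with ego-motion and keyframe nodes rather than fitting two scalars.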
Pub Date : 2026-02-01DOI: 10.1109/JBHI.2025.3589889
Zijun Wei, Zhiqiang Zhang, Sheng Quan Xie
The key to achieving assist-as-needed (AAN) control in rehabilitation robots lies in accurately predicting patient motion intentions. This study, for the first time, redefines motion intention prediction from the perspective of sequence-to-sequence translation by analogizing sEMG signals and joint angles to the source language and target language, respectively. The proposed 3DCNN-TF model achieves precise translation of neural control signals into kinematic representations. This model comprises three modules: an sEMG "sentence" generation module that compiles multiple sEMG sliding windows into a "sentence," a 3DCNN module based on muscle anatomy and electrode placement to extract muscle synergy features from each "word" in the "sentence," and a Transformer (TF) module that autoregressively generates the next joint angle as the translation result. Experimental results indicate that the 3DCNN-TF model achieves superior overall performance compared to eight baseline models and existing studies in continuously predicting wrist and knee flexion/extension angles across varying speeds. Moreover, the 3DCNN-TF achieves an optimal balance between prediction accuracy and computational efficiency while exhibiting exceptional robustness and generalizability. Specifically, the 3DCNN-TF achieves average nRMSE and R$^{2}$ values of (6.2%/95.5%) and (5.5%/96.2%) on wrist and knee datasets, respectively, with an average training time of less than two minutes. Additionally, the 3DCNN-TF can predict joint angles up to 300 ms in advance without compromising accuracy, which is critical for real-time AAN control in rehabilitation robots.
{"title":"A Transformer Framework Informed by Muscle Anatomy and Sequence-to-Sequence Translation for Continuous Joint Kinematics Prediction Using sEMG.","authors":"Zijun Wei, Zhiqiang Zhang, Sheng Quan Xie","doi":"10.1109/JBHI.2025.3589889","DOIUrl":"10.1109/JBHI.2025.3589889","url":null,"abstract":"<p><p>The key to achieving assist-as-needed (AAN) control in rehabilitation robots lies in accurately predicting patient motion intentions. This study, for the first time, redefines motion intention prediction from the perspective of sequence-to-sequence translation by analogizing sEMG signals and joint angles to the source language and target language, respectively. The proposed 3DCNN-TF model achieves precise translation of neural control signals into kinematic representations. This model comprises three modules: an sEMG \"sentence\" generation module that compiles multiple sEMG sliding windows into a \"sentence,\" a 3DCNN module based on muscle anatomy and electrode placement to extract muscle synergy features from each \"word\" in the \"sentence,\" and a Transformer (TF) module that autoregressively generates the next joint angle as the translation result. Experimental results indicate that the 3DCNN-TF model achieves superior overall performance compared to eight baseline models and existing studies in continuously predicting wrist and knee flexion/extension angles across varying speeds. Moreover, the 3DCNN-TF achieves an optimal balance between prediction accuracy and computational efficiency while exhibiting exceptional robustness and generalizability. Specifically, the 3DCNN-TF achieves average nRMSE and R<sup>2</sup> values of (6.2% /95.5% ) and (5.5% /96.2% ) on wrist and knee datasets, respectively, with an average training time of less than two minutes. 
Additionally, the 3DCNN-TF can predict joint angles up to 300 ms in advance without compromising accuracy, which is critical for real-time AAN control in rehabilitation robots.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":"1140-1152"},"PeriodicalIF":6.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144682469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
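The "sentence" generation module above — sliding windows of multi-channel sEMG rearranged into anatomy-aware 3D "words" — can be sketched as follows. The row-major electrode layout and the window/stride values are assumptions for illustration, not the paper's configuration:

```python
import numpy as np

def semg_sentence(emg, grid_shape, win_len, stride):
    """Compile sliding windows of multi-channel sEMG into a "sentence" tensor.

    emg: (T, C) raw samples from C electrodes. grid_shape: (rows, cols)
    electrode layout reflecting muscle anatomy (rows*cols must equal C;
    row-major placement is a hypothetical choice here). Each window becomes
    one "word" of shape (rows, cols, win_len), suitable for 3D convolution.
    Returns an array of shape (num_words, rows, cols, win_len).
    """
    T, C = emg.shape
    rows, cols = grid_shape
    assert rows * cols == C, "grid must cover every electrode exactly once"
    words = []
    for start in range(0, T - win_len + 1, stride):
        win = emg[start:start + win_len]            # (win_len, C)
        word = win.T.reshape(rows, cols, win_len)   # anatomy-aware 3D "word"
        words.append(word)
    return np.stack(words)
```

The resulting tensor is what a 3DCNN would consume per "word" before the Transformer translates the word sequence into joint angles.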
Pub Date : 2026-02-01DOI: 10.1109/JBHI.2025.3632032
Xiao Ke, Yang Chen, Wenzhong Guo
The anatomical information obtained from medical image segmentation provides a crucial decision-making basis for clinical diagnosis and treatment. Recently proposed deep networks with encoder-decoder architectures have achieved impressive results. However, these existing deep networks have some inherent flaws, e.g., network depth and downsampling operators jointly cause deep features to lose spatial detail information. We find that it is the lack of targeted solutions to these inherent flaws that makes it difficult to further improve segmentation performance. Therefore, based on these findings, we propose an end-to-end collaborative refinement method (CoRe). Specifically, we first generate an Error-Prone Region (EPR) by predicting an uncertainty map and a foreground boundary map to simulate the error region, and after locating pixels with high error proneness, we propose a feature refinement module (FRM) based on neighborhood-aware features and foreground-boundary-enhanced features to refine the upsampling features of the decoder, so as to better reconstruct the lost spatial detail information. In addition, a segmentation refinement module (SRM) is proposed to refine the coarse segmentation prediction by establishing highly representative global class centers that comprehensively contain the intrinsic properties of each segmentation target. Finally, we conduct extensive experiments on five datasets with different modalities and segmentation targets. The results show that our method achieves significant improvements and competes favorably with current state-of-the-art methods.
{"title":"CoRe: An End-to-End Collaborative Refinement Network for Medical Image Segmentation.","authors":"Xiao Ke, Yang Chen, Wenzhong Guo","doi":"10.1109/JBHI.2025.3632032","DOIUrl":"10.1109/JBHI.2025.3632032","url":null,"abstract":"<p><p>The anatomical information obtained from medical image segmentation will provide a crucial decision-making basis for clinical diagnosis and treatment. Deep networks with encoder-decoder architecture proposed recently have achieved impressive results. However, these existing deep networks have some inherent flaws, e.g., network depth and downsampling operators jointly determine the loss of spatial detail information of deep features. We find that it is the lack of targeted solutions to these inherent flaws that make it difficult to further improve the segmentation performance. Therefore, based on these findings, we propose an end-to-end collaborative refinement method (CoRe). Specifically, we first design to generate an Error-Prone Region (EPR) by predicting uncertainty map and foreground boundary map to simulate the error region, and after locating pixels with high error proneness, we propose a feature refinement module (FRM) based on neighborhood-aware features and foreground-boundary-enhanced features to refine the upsampling features of the decoder, so as to better reconstruct the lost spatial detail information. In addition, a segmentation refinement module (SRM) is proposed to refine coarse segmentation prediction by establishing highly representative global class centers that comprehensively contain the intrinsic properties of each segmentation target. Finally, we conduct extensive experiments on five datasets with different modalities and segmentation targets. 
The results show that our method achieves significant improvements and competes favorably with current state-of-the-art methods.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":"1339-1352"},"PeriodicalIF":6.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145512633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
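The Error-Prone Region idea above — flagging pixels that are either highly uncertain or near the foreground boundary — can be sketched with hand-rolled morphology. The entropy threshold and the 4-neighbour boundary operator are simplifying assumptions; CoRe predicts both maps with learned branches rather than computing them analytically:

```python
import numpy as np

def error_prone_region(prob_fg, unc_thresh=0.9, boundary=True):
    """Locate error-prone pixels from a coarse foreground probability map.

    prob_fg: (H, W) predicted foreground probabilities. Uncertainty is the
    binary entropy of each pixel; the foreground boundary is approximated by
    the morphological gradient (dilation minus erosion) of the thresholded
    mask. Returns a boolean EPR mask.
    """
    p = np.clip(prob_fg, 1e-7, 1 - 1e-7)
    entropy = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))   # in [0, 1]
    epr = entropy > unc_thresh
    if boundary:
        mask = (prob_fg > 0.5).astype(np.uint8)
        pad = np.pad(mask, 1)
        # 4-neighbour dilation / erosion via shifted views of the padded mask.
        shifts = [pad[1:-1, 1:-1], pad[:-2, 1:-1], pad[2:, 1:-1],
                  pad[1:-1, :-2], pad[1:-1, 2:]]
        dil = np.maximum.reduce(shifts)
        ero = np.minimum.reduce(shifts)
        epr |= (dil - ero).astype(bool)               # boundary band
    return epr
```

In the full method, refinement (FRM/SRM) is then applied only where this mask is set, which is what makes the correction targeted rather than uniform.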