Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2025.3632032
Xiao Ke, Yang Chen, Wenzhong Guo
The anatomical information obtained from medical image segmentation provides a crucial basis for clinical diagnosis and treatment decisions. Recently proposed deep networks with encoder-decoder architectures have achieved impressive results. However, these networks have inherent flaws; for example, network depth and downsampling operators jointly cause the loss of spatial detail in deep features. We find that it is the lack of targeted solutions to these inherent flaws that makes it difficult to further improve segmentation performance. Based on these findings, we propose an end-to-end collaborative refinement method (CoRe). Specifically, we first generate an Error-Prone Region (EPR) by predicting an uncertainty map and a foreground boundary map to approximate where segmentation errors occur. After locating pixels with high error proneness, a feature refinement module (FRM) based on neighborhood-aware features and foreground-boundary-enhanced features refines the upsampled decoder features, better reconstructing the lost spatial detail. In addition, a segmentation refinement module (SRM) refines the coarse segmentation prediction by establishing highly representative global class centers that comprehensively capture the intrinsic properties of each segmentation target. Finally, we conduct extensive experiments on five datasets with different modalities and segmentation targets. The results show that our method achieves significant improvements and competes favorably with current state-of-the-art methods.
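As an illustration of the EPR idea, the sketch below is an assumption about the mechanism, not the paper's implementation: it builds a pixel-wise uncertainty map from the binary entropy of the predicted foreground probability, a foreground boundary map from 4-neighbour disagreement of the thresholded mask, and combines the two.

```python
import numpy as np

def error_prone_region(prob_fg, tau=0.5):
    """Hypothetical EPR sketch: union of an uncertainty map (binary entropy
    of the foreground probability, maximal at p = 0.5) and a foreground
    boundary map (pixels whose thresholded label differs from a 4-neighbour).
    """
    p = np.clip(prob_fg, 1e-7, 1 - 1e-7)
    # Uncertainty map: binary entropy, normalized to [0, 1].
    uncertainty = -(p * np.log(p) + (1 - p) * np.log(1 - p)) / np.log(2)
    # Foreground boundary map via 4-neighbour disagreement of the mask.
    mask = (prob_fg > tau).astype(np.uint8)
    pad = np.pad(mask, 1, mode="edge")
    boundary = np.zeros_like(mask)
    h, w = mask.shape
    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        shifted = pad[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        boundary |= (shifted != mask).astype(np.uint8)
    # EPR: pixels that are uncertain or lie on the predicted boundary.
    return np.maximum(uncertainty, boundary.astype(float))
```

A pixel with probability 0.5 or one sitting on the predicted foreground edge scores near 1 and would be passed on for refinement.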
CoRe: An End-to-End Collaborative Refinement Network for Medical Image Segmentation. IEEE Journal of Biomedical and Health Informatics, pp. 1339-1352.
Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2024.3523532
Lang Zhang, Jinling He, Wang Li, Dong Liang, Yanjie Zhu
Magnetic resonance diffusion tensor imaging (DTI) is a unique non-invasive technique for measuring in vivo water molecule diffusion, reflecting tissue microstructure. However, acquiring high-quality DTI typically requires numerous diffusion-weighted images (DWIs) in multiple directions, resulting in long scan times that restrict its use in clinical and research settings. To address this limitation, we propose Diff-DTI, a fast DTI processing framework based on a feature-enhanced joint diffusion model, to reduce the number of DWIs needed for tensor fitting. Diff-DTI models the joint probability distribution of DWIs and DTI maps, supporting guided generation during inference. The incorporated feature enhancement fusion module further improves the precision and detail of the images generated by the diffusion model. Experiments were performed on three public DWI datasets. Results demonstrate that Diff-DTI achieves up to 10-fold acceleration (using 6 DWIs) while maintaining low normalized mean square error (NMSE) for DTI maps (2.89% for FA, 0.89% for MD, 0.95% for AD, and 0.98% for RD). Even with only 3 DWIs, the NMSEs of the generated DTI maps rose only modestly, to 3.51% for FA, 0.89% for MD, 1.13% for AD, and 1.10% for RD. We conclude that Diff-DTI can significantly reduce the number of acquired DWIs and the scan time with only a minor loss of image quality.
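The NMSE figures quoted above follow the standard definition, squared error normalized by the reference signal energy; whether Diff-DTI normalizes per-map or per-volume is not stated, so this is the generic form.

```python
import numpy as np

def nmse(reference, estimate):
    """Normalized mean square error, as commonly reported for DTI map
    evaluation: ||estimate - reference||^2 / ||reference||^2."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return np.sum((estimate - reference) ** 2) / np.sum(reference ** 2)
```

An NMSE of 0.0289 for the FA map corresponds to the 2.89% value reported for the 6-DWI setting.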
Diff-DTI: Fast Diffusion Tensor Imaging Using A Feature-Enhanced Joint Diffusion Model. IEEE Journal of Biomedical and Health Informatics, pp. 1300-1313.
Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2025.3597054
Muhammad Salman Haleem, Vasilis Aidonis, Eleni I Georga, Maria Krini, Maria Matsangidou, Angelos P Kassianos, Constantinos S Pattichis, Miguel Rujas, Laura Lopez-Perez, Giuseppe Fico, Leandro Pecchia, Dimitrios I Fotiadis, Gatekeeper Consortium
Monitoring of advanced cancer patients' health, treatment, and supportive care is essential for improving cancer survival outcomes. Traditionally, oncology has relied on clinical metrics such as survival rates, time to disease progression, and clinician-assessed toxicities. In recent years, patient-reported outcome measures (PROMs) have provided a complementary perspective, offering insights into patients' health-related quality of life (HRQoL). However, collecting PROMs consistently requires frequent clinical assessments, creating important logistical challenges. Wearable devices combined with artificial intelligence (AI) present an innovative solution for continuous, real-time HRQoL monitoring. While deep learning models effectively capture temporal patterns in physiological data, most existing approaches are unimodal, limiting their ability to address patient heterogeneity and complexity. This study introduces a multimodal deep learning approach to estimate HRQoL in advanced cancer patients. Physiological data, such as heart rate and sleep quality collected via wearable devices, are analyzed using a hybrid model combining convolutional neural networks (CNNs) and bidirectional long short-term memory (BiLSTM) networks with an attention mechanism. The BiLSTM extracts temporal dynamics, while the attention mechanism highlights key features, and CNNs detect localized patterns. PROMs, including the Hospital Anxiety and Depression Scale (HADS) and the Integrated Palliative Care Outcome Scale (IPOS), are processed through a parallel neural network before being integrated into the physiological data pipeline. The proposed model was validated with data from 204 patients over 42 days, achieving a mean absolute percentage error (MAPE) of 0.24 in HRQoL prediction. These results demonstrate the potential of combining wearable data and PROMs to improve advanced cancer care.
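As a minimal sketch of the attention pooling described above (the variable names and scoring function are illustrative, not the paper's): score each BiLSTM timestep, softmax the scores, and form a weighted context vector.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(hidden, w):
    """Toy attention over a sequence of hidden states. `hidden` is (T, D),
    e.g. BiLSTM outputs over T timesteps; `w` is a learned (D,) scoring
    vector (an assumption for illustration)."""
    scores = hidden @ w           # (T,) unnormalized relevance per timestep
    alpha = softmax(scores)       # attention weights, sum to 1
    return alpha @ hidden, alpha  # (D,) context vector and the weights
```

The context vector would then be concatenated with the PROM branch's output before the final HRQoL regression head.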
A Multimodal Deep Learning Architecture for Estimating Quality of Life for Advanced Cancer Patients Based on Wearable Devices and Patient-Reported Outcome Measures. IEEE Journal of Biomedical and Health Informatics, pp. 1166-1177.
Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2025.3594014
Rongqian Zhang, Guanwen Xie, Jie Ying, Zhongsheng Hua
Parkinson's Disease (PD) treatment is challenging due to symptom heterogeneity and the lack of a definitive cure. Lifelong medication requires personalized treatment plans developed by physicians, but such approaches are constrained by high costs and limited physician capacity. Although deep learning (DL) methods have been explored, they lack interpretability and are restricted to numerical data inputs. In this study, we propose a novel framework that leverages large language models (LLMs) to design personalized PD treatment strategies, integrating both patient information in natural language form and external textual knowledge sources (e.g., medical guidelines). To enhance effectiveness, we use Monte Carlo Tree Search (MCTS) to refine strategies and establish a robust medication recommendation dataset. To enhance reliability and interpretability, we incorporate Retrieval-Augmented Generation (RAG) and Chain-of-Thought (CoT) reasoning within the LLM system, ensuring that each proposed strategy is accompanied by step-by-step explanations and references to similar historical cases. Experimental evaluations using the Parkinson's Progression Markers Initiative (PPMI) dataset show that our method surpasses physician-prescribed treatments, achieving an average reduction of over 1.4 points in the Movement Disorder Society's revised Unified Parkinson's Disease Rating Scale Part III (MDS-UPDRS-III) scores. Our method also outperforms the reinforcement learning (RL) baseline by 1.01 points on average. Furthermore, over 43% of patients achieve a reduction of more than 2 points in MDS-UPDRS-III scores. A detailed case study highlights the flexibility of LLMs in dynamically adjusting medication plans for patients at different disease stages, underscoring the framework's potential to advance personalized PD management in real-world settings.
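A minimal sketch of the MCTS selection step used for strategy refinement, assuming the standard UCT rule (mean value plus an exploration bonus); the paper's actual search details may differ.

```python
import math

def uct_select(children, c=1.4):
    """Pick the child action maximizing mean value plus exploration bonus.
    `children` maps action -> (visit_count, total_value); these names and
    the constant c are illustrative assumptions."""
    total = sum(visits for visits, _ in children.values())

    def score(item):
        visits, value = item[1]
        if visits == 0:
            return float("inf")  # always try unvisited actions first
        return value / visits + c * math.sqrt(math.log(total) / visits)

    return max(children.items(), key=score)[0]
```

In this setting each "action" could be a candidate medication adjustment, with node values derived from predicted MDS-UPDRS-III improvement.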
Leveraging Large Language Models for Personalized Parkinson's Disease Treatment. IEEE Journal of Biomedical and Health Informatics, pp. 1693-1706.
Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2025.3595826
Yong Peng, Jiangchuan Liu, Honggang Liu, Natasha Padfield, Junhua Li, Wanzeng Kong, Bao-Liang Lu, Andrzej Cichocki
Domain adaptation has proven effective for suppressing the inter-subject variability problem in cross-subject EEG classification tasks, in which labeled data is available for source subjects while only unlabeled data is provided for target subjects. Existing domain adaptation methods typically reduce the distribution discrepancy between source and target domains by directly utilizing source domain samples or features. To safeguard the privacy of source domain data, we propose to construct a Proxy Domain, serving as a substitute for the source domain, by simultaneously considering the prediction Consistency and Confidence (PDCC) of locally trained source models on target EEG samples. The framework commences with the augmentation and alignment of the source domain data to enhance feature generalizability, after which source models are trained independently on each source subject's data in a decentralized manner. Knowledge transfer from source to target domains is achieved exclusively through access to the source models, enabling the PDCC-based proxy domain construction that encapsulates the source knowledge. Finally, domain adaptation is performed using the proxy and target domains. As a result, PDCC eliminates the need to access source domain data while effectively leveraging source knowledge. Experimental results on four benchmark EEG datasets demonstrate that PDCC consistently outperforms eleven existing methods, including several advanced transfer learning and source-free methods. In particular, the effectiveness of the proxy domain is extensively investigated.
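Schematically, the PDCC selection could look like the numpy sketch below, which keeps target samples on which all decentralized source models agree (consistency) and whose mean maximum probability is high (confidence); the thresholding details are assumptions, not the paper's exact criterion.

```python
import numpy as np

def build_proxy_domain(source_probs, conf_thresh=0.8):
    """Schematic proxy-domain construction. `source_probs` has shape
    (n_models, n_samples, n_classes): each source model's softmax output
    on the target samples. Returns the indices of selected samples and
    their consensus pseudo-labels."""
    preds = source_probs.argmax(axis=2)                 # per-model hard labels
    consistent = (preds == preds[0]).all(axis=0)        # all models agree
    confidence = source_probs.max(axis=2).mean(axis=0)  # mean max-probability
    keep = consistent & (confidence >= conf_thresh)
    return np.flatnonzero(keep), preds[0][keep]         # proxy indices + labels
```

The selected (sample, pseudo-label) pairs then stand in for the source domain during adaptation, so raw source EEG never leaves its owner.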
Prediction Consistency and Confidence-Based Proxy Domain Construction for Privacy-Preserving in Cross-Subject EEG Classification. IEEE Journal of Biomedical and Health Informatics, pp. 1115-1127.
Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2025.3601969
Ahsanul Islam, Sadia Akter, Tahsina Farah Sanam
Preserving patient privacy in digital healthcare systems is a critical challenge, particularly in non-intrusive monitoring applications. This paper introduces VitalCrypt, a novel framework for secure and real-time vital sign monitoring that combines Channel State Information (CSI) with homomorphic encryption and lightweight deep learning. Homomorphic encryption enables computations directly on encrypted data, ensuring data confidentiality throughout the processing pipeline. The framework incorporates well-established signal preprocessing techniques, such as Hampel, Savitzky-Golay, and elliptic filters for noise removal, with Principal Component Analysis (PCA) for dimensionality reduction. The Power Spectral Density (PSD) of these refined signals is used as features, which are then fed into a lightweight neural network optimized with encryption-compatible activation functions for classification. The system effectively classifies breathing and heart rates while maintaining compatibility with homomorphic encryption schemes. Experimental evaluations were conducted using a publicly available dataset. The results demonstrated exceptional accuracy, achieving 99.46% for breathing rate classification on plain data and 99.44% on encrypted data, with negligible performance degradation despite increased runtime due to encryption. The results of heart rate classification are also discussed. The framework processes encrypted data at approximately seven times the latency of plain data; however, this trade-off is justified by the substantial privacy benefits attained. VitalCrypt showcases the potential of secure, privacy-preserving deep learning applications in healthcare, addressing critical challenges in real-time, non-intrusive patient monitoring. By balancing high accuracy and data confidentiality, this framework provides a scalable solution for healthcare applications, including remote monitoring and clinical diagnostics.
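As an example of the preprocessing stage, a minimal Hampel filter (the window size and threshold below are illustrative defaults, not the paper's settings) replaces local outliers with the window median.

```python
import numpy as np

def hampel(x, window=3, n_sigma=3.0):
    """Minimal Hampel outlier filter: any sample deviating from the local
    median by more than n_sigma scaled-MADs is replaced by that median."""
    x = np.asarray(x, dtype=float).copy()
    k = 1.4826  # scales MAD to a Gaussian standard deviation
    for i in range(len(x)):
        lo, hi = max(0, i - window), min(len(x), i + window + 1)
        med = np.median(x[lo:hi])
        mad = k * np.median(np.abs(x[lo:hi] - med))
        if mad > 0 and abs(x[i] - med) > n_sigma * mad:
            x[i] = med
    return x
```

In a CSI pipeline this would run before Savitzky-Golay smoothing and PCA, removing isolated spikes without blurring the periodic breathing component.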
Secure Tracking of Patient's Vital Signs Using CSI-Based Homomorphic Encryption-Enabled Deep Learning Framework. IEEE Journal of Biomedical and Health Informatics, pp. 1314-1327.
Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2025.3636120
Zhiwen Wu, Fei Gu, Jing Wu, Shikun Sun, Changsheng Ma
Spontaneous Echo Contrast (SEC) is a swirling, smoke-like echo phenomenon in Transesophageal Echocardiography (TEE) videos caused by slow blood flow and hypercoagulable states. It is a significant indicator for assessing thromboembolic risk. However, current SEC identification requires extensive manual intervention, leading to low accuracy, high costs, and subjectivity. To address these issues, we propose TEENet, an effective clinical detection network for identifying SEC in TEE videos. Specifically, TEENet first generates attention maps for the input clips to highlight important regions and integrates a convolutional neural network with multi-head self-attention to capture spatiotemporal representations. Furthermore, to enhance classification performance across different SEC severity grades, we introduce an auxiliary classification module that simultaneously utilizes the main classification head and auxiliary classification heads. Notably, we constructed a comprehensive dataset of 1106 TEE videos collected during clinical examinations performed at the First Affiliated Hospital of Soochow University from 2018 to 2023, providing a solid foundation for the development and validation of TEENet. Extensive experimental results demonstrate that our proposed network achieves the highest SEC identification accuracy of 92.4±1.3% compared to other spatiotemporal representation networks such as SlowFastR50 (89.6±0.7%) and TimeSformer (74.9±1.8%), which shows strong potential for effective auxiliary diagnosis in clinical practice.
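The main-plus-auxiliary-head objective can be sketched as a weighted sum of cross-entropy losses; the 0.4 auxiliary weight below is an assumption for illustration, not TEENet's value.

```python
import numpy as np

def combined_loss(main_logits, aux_logits_list, label, aux_weight=0.4):
    """Cross-entropy on the main head plus a down-weighted cross-entropy on
    each auxiliary head, all supervised by the same severity-grade label."""
    def ce(logits):
        z = logits - logits.max()                  # numerically stable softmax
        log_probs = z - np.log(np.exp(z).sum())
        return -log_probs[label]

    loss = ce(main_logits)
    for aux in aux_logits_list:
        loss += aux_weight * ce(aux)
    return loss
```

At inference only the main head is used; the auxiliary heads exist to inject extra gradient signal at intermediate layers during training.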
TEENet: An Effective Clinical Detection Network for Identifying Spontaneous Echo Contrast Automatically. IEEE Journal of Biomedical and Health Informatics, pp. 1365-1377.
Pub Date: 2026-02-01 | DOI: 10.1109/JBHI.2024.3505955
Yizhen Luo, Jiahuan Zhang, Siqi Fan, Kai Yang, Massimo Hong, Yushuai Wu, Mu Qiao, Zaiqing Nie
Recent advances in large language models (LLMs) such as ChatGPT have shed light on the development of knowledgeable and versatile AI research assistants across scientific domains. However, they fall short in biomedical applications due to a lack of proprietary biomedical knowledge and deficiencies in handling biological sequences for molecules and proteins. To address these issues, we present BioMedGPT, a multimodal large language model for assisting biomedical research. We first incorporate domain expertise into LLMs by incremental pre-training on large-scale biomedical literature. Then, we harmonize 2D molecular graphs, protein sequences, and natural language within a unified, parameter-efficient fusion architecture by fine-tuning on multimodal question-answering datasets. Through comprehensive experiments, we show that BioMedGPT performs on par with human experts in comprehending biomedical documents and answering research questions. It also exhibits promising capability in analyzing intricate functions and properties of novel molecules and proteins, surpassing state-of-the-art LLMs by absolute ROUGE-L gains of 17.1% and 49.8% on molecule and protein question-answering, respectively.
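The reported ROUGE-L metric scores candidate answers by their longest common subsequence with the reference. A self-contained sketch follows; the beta weighting uses the common ROUGE convention, though the paper's exact scoring settings are not given here.

```python
def rouge_l_f1(reference, candidate, beta=1.2):
    """ROUGE-L F-score: LCS-based recall/precision over word sequences."""
    ref, cand = reference.split(), candidate.split()
    # Longest common subsequence length via dynamic programming.
    dp = [[0] * (len(cand) + 1) for _ in range(len(ref) + 1)]
    for i, r in enumerate(ref):
        for j, c in enumerate(cand):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if r == c
                                else max(dp[i][j + 1], dp[i + 1][j]))
    lcs = dp[-1][-1]
    if lcs == 0:
        return 0.0
    rec, prec = lcs / len(ref), lcs / len(cand)
    return (1 + beta ** 2) * prec * rec / (rec + beta ** 2 * prec)
```

An absolute gain of 17.1% in this score means the model's generated molecule descriptions share substantially longer in-order word overlaps with the reference answers than the baselines' do.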
{"title":"BioMedGPT: An Open Multimodal Large Language Model for BioMedicine.","authors":"Yizhen Luo, Jiahuan Zhang, Siqi Fan, Kai Yang, Massimo Hong, Yushuai Wu, Mu Qiao, Zaiqing Nie","doi":"10.1109/JBHI.2024.3505955","DOIUrl":"10.1109/JBHI.2024.3505955","url":null,"abstract":"<p><p>Recent advances in large language models (LLMs) like ChatGPT have shed light on the development of knowledgeable and versatile AI research assistants in various scientific domains. However, they fall short in biomedical applications due to a lack of proprietary biomedical knowledge and deficiencies in handling biological sequences for molecules and proteins. To address these issues, we present BioMedGPT, a multimodal large language model for assisting biomedical research. We first incorporate domain expertise into LLMs by incremental pre-training on large-scale biomedical literature. Then, we harmonize 2D molecular graphs, protein sequences, and natural language within a unified, parameter-efficient fusion architecture by fine-tuning on multimodal question-answering datasets. Through comprehensive experiments, we show that BioMedGPT performs on par with human experts in comprehending biomedical documents and answering research questions. 
It also exhibits promising capability in analyzing intricate functions and properties of novel molecules and proteins, surpassing state-of-the-art LLMs by 17.1% and 49.8% absolute gains respectively in ROUGE-L on molecule and protein question-answering.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":"981-992"},"PeriodicalIF":6.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143604651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-02-01DOI: 10.1109/JBHI.2025.3606992
Rafic Nader, Vincent L'Allinec, Romain Bourcier, Florent Autrusseau
Intracranial aneurysms (ICA) commonly occur in specific segments of the Circle of Willis (CoW), primarily at thirteen major arterial bifurcations. Accurate detection of these critical landmarks is necessary for a prompt and efficient diagnosis. We introduce a fully automated landmark detection approach for CoW bifurcations using a two-step neural network process. Initially, an object detection network identifies regions of interest (ROIs) proximal to the landmark locations. Subsequently, a modified U-Net with deep supervision is exploited to accurately locate the bifurcations. This two-step method reduces various problems, such as missed detections caused by two landmarks lying close to each other and sharing similar visual characteristics, especially when processing the complete Time-of-Flight (TOF) MRA. Additionally, it accounts for the anatomical variability of the CoW, which affects the number of detectable landmarks per scan. We assessed the effectiveness of our approach on two cerebral MRA datasets: our in-house dataset, which had varying numbers of landmarks, and a public dataset with a standardized landmark configuration. Our experimental results demonstrate that our method achieves the highest level of performance on a bifurcation detection task.
{"title":"Two-Steps Neural Networks for an Automated Cerebrovascular Landmark Detection Along the Circle of Willis.","authors":"Rafic Nader, Vincent L'Allinec, Romain Bourcier, Florent Autrusseau","doi":"10.1109/JBHI.2025.3606992","DOIUrl":"10.1109/JBHI.2025.3606992","url":null,"abstract":"<p><p>Intracranial aneurysms (ICA) commonly occur in specific segments of the Circle of Willis (CoW), primarily at thirteen major arterial bifurcations. Accurate detection of these critical landmarks is necessary for a prompt and efficient diagnosis. We introduce a fully automated landmark detection approach for CoW bifurcations using a two-step neural network process. Initially, an object detection network identifies regions of interest (ROIs) proximal to the landmark locations. Subsequently, a modified U-Net with deep supervision is exploited to accurately locate the bifurcations. This two-step method reduces various problems, such as missed detections caused by two landmarks lying close to each other and sharing similar visual characteristics, especially when processing the complete Time-of-Flight (TOF) MRA. Additionally, it accounts for the anatomical variability of the CoW, which affects the number of detectable landmarks per scan. We assessed the effectiveness of our approach on two cerebral MRA datasets: our in-house dataset, which had varying numbers of landmarks, and a public dataset with a standardized landmark configuration. 
Our experimental results demonstrate that our method achieves the highest level of performance on a bifurcation detection task.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":"1353-1364"},"PeriodicalIF":6.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145023207","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-02-01DOI: 10.1109/JBHI.2025.3566167
Daniel Foronda-Pascual, Carmen Camara, Pedro Peris-Lopez
Biometric data are extensively used in modern healthcare systems and are often transmitted over networks for various purposes, raising inherent privacy and security concerns. Wearable devices, smartphones, and Internet of Things (IoT) technologies are common sources of such data, which are susceptible to interception during transmission. To mitigate these risks, cancelable biometrics offer a promising solution by enabling secure and privacy-preserving identification. In this study, we propose a cancelable identification model based on contactless heart signals acquired via continuous-wave radar. The recorded signal, which reflects cardiac motion, is first transformed into a scalogram. Feature extraction is then performed using Convolutional Neural Networks (CNNs), comparing models trained via transfer learning with those trained solely on the dataset. Before classification, the extracted features are converted into cancelable templates using Gaussian Random Projection (GRP), and classification is performed using a Multilayer Perceptron (MLP). The proposed method demonstrates feasibility, achieving 91.20% accuracy across all scenarios in the dataset, which increases to 95.40% when focusing solely on the resting scenario.
{"title":"Untouchable and Cancelable Biometrics: Human Identification in Various Physiological States Using Radar-Based Heart Signals.","authors":"Daniel Foronda-Pascual, Carmen Camara, Pedro Peris-Lopez","doi":"10.1109/JBHI.2025.3566167","DOIUrl":"10.1109/JBHI.2025.3566167","url":null,"abstract":"<p><p>Biometric data are extensively used in modern healthcare systems and are often transmitted over networks for various purposes, raising inherent privacy and security concerns. Wearable devices, smartphones, and Internet of Things (IoT) technologies are common sources of such data, which are susceptible to interception during transmission. To mitigate these risks, cancelable biometrics offer a promising solution by enabling secure and privacy-preserving identification. In this study, we propose a cancelable identification model based on contactless heart signals acquired via continuous-wave radar. The recorded signal, which reflects cardiac motion, is first transformed into a scalogram. Feature extraction is then performed using Convolutional Neural Networks (CNNs), comparing models trained via transfer learning with those trained solely on the dataset. Before classification, the extracted features are converted into cancelable templates using Gaussian Random Projection (GRP), and classification is performed using a Multilayer Perceptron (MLP). The proposed method demonstrates feasibility, achieving 91.20% accuracy across all scenarios in the dataset, which increases to 95.40% when focusing solely on the resting scenario. 
Additionally, CNNs trained exclusively on the dataset outperform pre-trained models using transfer learning in feature extraction performance.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":"921-934"},"PeriodicalIF":6.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143984356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
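The cancelable-template step in the abstract above (CNN features passed through Gaussian Random Projection before the MLP classifier) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `cancelable_template`, the 512-dimensional stand-in feature vector, and the 64-dimensional output are all assumed for the example. The key idea shown is that the projection matrix is derived from a user-specific key, so a compromised template can be revoked by reissuing a new key.

```python
import numpy as np

def cancelable_template(features: np.ndarray, key: int, out_dim: int = 64) -> np.ndarray:
    """Project a feature vector through a key-seeded Gaussian random matrix.

    The same (features, key) pair always yields the same template, while a
    new key yields a fresh, unlinkable template -- the 'cancelable' property.
    """
    rng = np.random.default_rng(key)  # projection matrix is a function of the key
    in_dim = features.shape[-1]
    # Gaussian entries scaled by 1/sqrt(out_dim), so pairwise distances are
    # approximately preserved (Johnson-Lindenstrauss-style projection).
    projection = rng.normal(0.0, 1.0 / np.sqrt(out_dim), size=(in_dim, out_dim))
    return features @ projection

# Stand-in for a CNN feature vector extracted from a radar-signal scalogram.
feat = np.random.default_rng(0).normal(size=512)

t1 = cancelable_template(feat, key=1234)  # enrolled template
t2 = cancelable_template(feat, key=1234)  # same key: identical template
t3 = cancelable_template(feat, key=9999)  # revoked key: different template
```

In a deployment sketch like this, `t1` (not `feat`) is what gets stored or transmitted; if it leaks, the key is revoked and identification continues with a newly issued projection.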