DMSACNN: Deep Multiscale Attentional Convolutional Neural Network for EEG-Based Motor Decoding.
Pub Date: 2025-02-27 | DOI: 10.1109/JBHI.2025.3546288
Ke Liu, Xin Xing, Tao Yang, Zhuliang Yu, Bin Xiao, Guoyin Wang, Wei Wu
Objective: Accurate decoding of electroencephalogram (EEG) signals is increasingly important for brain-computer interfaces (BCIs). Specifically, motor imagery and motor execution (MI/ME) tasks enable the control of external devices by decoding EEG signals during imagined or real movements. However, accurately decoding MI/ME signals remains challenging because existing methods make limited use of temporal information and rely on ineffective feature selection.
Methods: This paper introduces DMSACNN, an end-to-end deep multiscale attention convolutional neural network for MI/ME-EEG decoding. DMSACNN incorporates a deep multiscale temporal feature extraction module to capture temporal features at various levels. These features are then processed by a spatial convolutional module to extract spatial features. Finally, a local and global feature fusion attention module is utilized to combine local and global information and extract the most discriminative spatiotemporal features.
Main results: DMSACNN achieves hold-out accuracies of 78.20%, 96.34%, and 70.90% on the BCI-IV-2a, High Gamma, and OpenBMI datasets, respectively, outperforming most state-of-the-art methods.
Conclusion and significance: These results highlight the potential of DMSACNN for robust BCI applications. Our proposed method provides a valuable solution for improving the accuracy of MI/ME-EEG decoding, which can pave the way for more efficient and reliable BCI systems. The source code for DMSACNN is available at https://github.com/xingxin-99/DMSANet.git.
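To make the multiscale temporal idea concrete, the sketch below runs several temporal convolutions of different kernel lengths in parallel over an EEG epoch and concatenates the results. It is a minimal Python/PyTorch illustration; the class name, kernel sizes, and input shape are assumptions for illustration, not the authors' DMSACNN implementation (see their repository for that).

```python
# Minimal sketch of multiscale temporal feature extraction for EEG,
# assuming input shape (batch, 1, channels, time). Kernel sizes and
# channel counts are illustrative, not DMSACNN's actual values.
import torch
import torch.nn as nn

class MultiscaleTemporalBlock(nn.Module):
    def __init__(self, out_per_scale=8, kernel_sizes=(15, 31, 63)):
        super().__init__()
        # One temporal convolution per scale; "same" padding (odd kernels)
        # keeps the time axis aligned so outputs can be concatenated.
        self.branches = nn.ModuleList(
            nn.Conv2d(1, out_per_scale, kernel_size=(1, k), padding=(0, k // 2))
            for k in kernel_sizes
        )

    def forward(self, x):  # x: (batch, 1, n_channels, n_samples)
        return torch.cat([b(x) for b in self.branches], dim=1)

x = torch.randn(4, 1, 22, 1000)       # e.g., 22-channel EEG, 1000 samples
feats = MultiscaleTemporalBlock()(x)  # (4, 24, 22, 1000): 3 scales x 8 maps
print(feats.shape)
```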
{"title":"DMSACNN: Deep Multiscale Attentional Convolutional Neural Network for EEG-Based Motor Decoding.","authors":"Ke Liu, Xin Xing, Tao Yang, Zhuliang Yu, Bin Xiao, Guoyin Wang, Wei Wu","doi":"10.1109/JBHI.2025.3546288","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3546288","url":null,"abstract":"<p><strong>Objective: </strong>Accurate decoding of electroencephalogram (EEG) signals has become more significant for the brain-computer interface (BCI). Specifically, motor imagery and motor execution (MI/ME) tasks enable the control of external devices by decoding EEG signals during imagined or real movements. However, accurately decoding MI/ME signals remains a challenge due to the limited utilization of temporal information and ineffective feature selection methods.</p><p><strong>Methods: </strong>This paper introduces DMSACNN, an end-to-end deep multiscale attention convolutional neural network for MI/ME-EEG decoding. DMSACNN incorporates a deep multiscale temporal feature extraction module to capture temporal features at various levels. These features are then processed by a spatial convolutional module to extract spatial features. Finally, a local and global feature fusion attention module is utilized to combine local and global information and extract the most discriminative spatiotemporal features.</p><p><strong>Main results: </strong>DMSACNN achieves impressive accuracies of 78.20%, 96.34% and 70.90% for hold-out analysis on the BCI-IV-2a, High Gamma and OpenBMI datasets, respectively, outperforming most of the state-of-the-art methods.</p><p><strong>Conclusion and significance: </strong>These results highlight the potential of DMSACNN in robust BCI applications. Our proposed method provides a valuable solution to improve the accuracy of the MI/ME-EEG decoding, which can pave the way for more efficient and reliable BCI systems. The source code for DMSACNN is available at https://github.com/xingxin-99/DMSANet.git.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
EMGANet: Edge-Aware Multi-Scale Group-Mix Attention Network for Breast Cancer Ultrasound Image Segmentation.
Pub Date: 2025-02-27 | DOI: 10.1109/JBHI.2025.3546345
Jin Huang, Yazhao Mao, Jingwen Deng, Zhaoyi Ye, Yimin Zhang, Jingwen Zhang, Lan Dong, Hui Shen, Jinxuan Hou, Yu Xu, Xiaoxiao Li, Sheng Liu, Du Wang, Shengrong Sun, Liye Mei, Cheng Lei
Breast cancer is one of the most prevalent diseases among women worldwide. Early and accurate ultrasound image segmentation plays a crucial role in reducing mortality. Although deep learning methods have demonstrated remarkable segmentation potential, they still struggle with challenges in ultrasound images, including blurred boundaries and speckle noise. To generate accurate ultrasound image segmentation, this paper proposes the Edge-Aware Multi-Scale Group-Mix Attention Network (EMGANet), which integrates deep and edge features. The Multi-Scale Group-Mix Attention block effectively aggregates both sparse global and local features, ensuring the extraction of valuable information. The subsequent Edge Feature Enhancement block then focuses on cancer boundaries, improving segmentation accuracy. EMGANet therefore effectively handles unclear boundaries and noise in ultrasound images. We conduct experiments on two public datasets (Dataset-B, BUSI) and one private dataset containing 927 samples from Renmin Hospital of Wuhan University (BUSI-WHU). EMGANet demonstrates superior segmentation performance, achieving an overall accuracy (OA) of 98.56%, a mean IoU (mIoU) of 90.32%, and an ASSD of 6.1 pixels on the BUSI-WHU dataset. It also performs well on the two public datasets, with an mIoU of 88.2% and an ASSD of 9.2 pixels on Dataset-B, and an mIoU of 81.37% and an ASSD of 18.27 pixels on BUSI. Across the three datasets, EMGANet improves mIoU by about 2% over prior state-of-the-art methods. In summary, the proposed EMGANet significantly improves breast cancer segmentation through its Edge-Aware and Group-Mix Attention mechanisms, showing great potential for clinical applications.
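For reference, the two headline metrics can be computed as below: IoU over binary masks and average symmetric surface distance (ASSD) via distance transforms. This is a minimal numpy/scipy sketch with hypothetical function names, not code from the EMGANet paper.

```python
# Minimal sketch of IoU and ASSD for 2-D binary segmentation masks.
import numpy as np
from scipy import ndimage

def iou(pred, gt):
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def assd(pred, gt):
    # Surface pixels = mask minus its erosion.
    ps = pred ^ ndimage.binary_erosion(pred)
    gs = gt ^ ndimage.binary_erosion(gt)
    # Distance from every pixel to the nearest surface pixel of the other mask.
    d_to_gt = ndimage.distance_transform_edt(~gs)
    d_to_pr = ndimage.distance_transform_edt(~ps)
    return (d_to_gt[ps].mean() + d_to_pr[gs].mean()) / 2.0

pred = np.zeros((64, 64), bool); pred[20:40, 20:40] = True
gt = np.zeros((64, 64), bool); gt[22:42, 22:42] = True
print(iou(pred, gt), assd(pred, gt))
```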
{"title":"EMGANet: Edge-Aware Multi-Scale Group-Mix Attention Network for Breast Cancer Ultrasound Image Segmentation.","authors":"Jin Huang, Yazhao Mao, Jingwen Deng, Zhaoyi Ye, Yimin Zhang, Jingwen Zhang, Lan Dong, Hui Shen, Jinxuan Hou, Yu Xu, Xiaoxiao Li, Sheng Liu, Du Wang, Shengrong Sun, Liye Mei, Cheng Lei","doi":"10.1109/JBHI.2025.3546345","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3546345","url":null,"abstract":"<p><p>Breast cancer is one of the most prevalent diseases for women worldwide. Early and accurate ultrasound image segmentation plays a crucial role in reducing mortality. Although deep learning methods have demonstrated remarkable segmentation potential, they still struggle with challenges in ultrasound images, including blurred boundaries and speckle noise. To generate accurate ultrasound image segmentation, this paper proposes the Edge-Aware Multi-Scale Group-Mix Attention Network (EMGANet), which generates accurate segmentation by integrating deep and edge features. The Multi-Scale Group Mix Attention block effectively aggregates both sparse global and local features, ensuring the extraction of valuable information. The subsequent Edge Feature Enhancement block then focuses on cancer boundaries, enhancing the segmentation accuracy. Therefore, EMGANet effectively tackles unclear boundaries and noise in ultrasound images. We conduct experiments on two public datasets (Dataset- B, BUSI) and one private dataset which contains 927 samples from Renmin Hospital of Wuhan University (BUSIWHU). EMGANet demonstrates superior segmentation performance, achieving an overall accuracy (OA) of 98.56%, a mean IoU (mIoU) of 90.32%, and an ASSD of 6.1 pixels on the BUSI-WHU dataset. Additionally, EMGANet performs well on two public datasets, with a mIoU of 88.2% and an ASSD of 9.2 pixels on Dataset-B, and a mIoU of 81.37% and an ASSD of 18.27 pixels on the BUSI dataset. EMGANet achieves a state-of-the-art segmentation performance of about 2% in mIoU across three datasets. In summary, the proposed EMGANet significantly improves breast cancer segmentation through Edge-Aware and Group-Mix Attention mechanisms, showing great potential for clinical applications.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Respiratory Anomaly and Disease Detection Using Multi-Level Temporal Convolutional Networks.
Pub Date: 2025-02-26 | DOI: 10.1109/JBHI.2025.3545156
Kim-Ngoc T Le, Gyurin Byun, Syed M Raza, Duc-Tai Le, Hyunseung Choo
Automated analysis of respiratory sounds using deep learning (DL) plays a pivotal role in the early detection of lung diseases. However, current DL methods often examine the spatial and temporal characteristics of respiratory sounds in isolation, which inherently limits their potential. This study proposes a novel DL framework that captures spatial features through convolution operations and exploits the spatiotemporal correlations of these features using temporal convolutional networks. The proposed framework incorporates Multi-Level Temporal Convolutional Networks (ML-TCN) to considerably enhance model accuracy in detecting anomalous breathing cycles and respiratory recordings from lung sound audio. Moreover, a transfer learning technique is employed to extract semantic features efficiently from the limited and imbalanced data in this domain. Thorough experiments on the well-known ICBHI 2017 challenge dataset show that the proposed framework outperforms state-of-the-art methods in both binary and multi-class classification tasks for respiratory anomaly and disease detection. In particular, improvements of up to 2.29% and 2.27% in the Score metric (the average of sensitivity and specificity) are demonstrated in the binary and multi-class anomalous-breathing-cycle detection tasks, respectively. In respiratory recording classification tasks, classification accuracy improves by 2.69% for healthy-unhealthy binary classification and by 1.47% for the healthy, chronic, and non-chronic diagnosis task. These results highlight the marked advantage of the ML-TCN over existing techniques, showcasing its potential to drive future innovations in respiratory healthcare technology.
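The building block behind TCNs is the dilated causal convolution. The sketch below shows one such residual block and a stack with doubling dilations; it is a generic PyTorch illustration under assumed shapes, not the paper's ML-TCN.

```python
# Minimal sketch of a dilated causal convolution block, the standard
# building block of temporal convolutional networks (TCNs).
import torch
import torch.nn as nn

class CausalBlock(nn.Module):
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left-only padding => causal
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x):  # x: (batch, channels, time)
        y = nn.functional.pad(x, (self.pad, 0))  # pad only the past
        return self.relu(self.conv(y)) + x       # residual connection

# Doubling dilations widen the receptive field exponentially with depth.
net = nn.Sequential(*[CausalBlock(16, dilation=2 ** i) for i in range(4)])
x = torch.randn(2, 16, 400)   # e.g., 400 frames of spectral features
print(net(x).shape)           # (2, 16, 400)
```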
{"title":"Respiratory Anomaly and Disease Detection Using Multi-Level Temporal Convolutional Networks.","authors":"Kim-Ngoc T Le, Gyurin Byun, Syed M Raza, Duc-Tai Le, Hyunseung Choo","doi":"10.1109/JBHI.2025.3545156","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3545156","url":null,"abstract":"<p><p>An automated analysis of respiratory sounds using Deep Learning (DL) plays a pivotal role in the early detection of lung diseases. However, current DL methods often examine the spatial and temporal characteristics of respiratory sounds in isolation, which inherently limit their potential. This study proposes a novel DL framework that captures spatial features through convolution operations and exploits the spatiotemporal correlations of these features using temporal convolution networks. The proposed framework incorporates Multi-Level Temporal Convolutional Networks (ML-TCN) to considerably enhance the model accuracy in detecting anomaly breathing cycles and respiratory recordings from lung sound audio. Moreover, a transfer learning technique is also employed to extract semantic features efficiently from limited and imbalanced data in this domain. Thorough experiments on the well-known ICBHI 2017 challenge dataset show that the proposed framework outperforms state-of-the-art methods in both binary and multi-class classification tasks for respiratory anomaly and disease detection. In particular, improvements of up to 2.29% and 2.27% in terms of the Score metric, average sensitivity and specificity, are demonstrated in binary and multi-class anomaly breathing cycle detection tasks, respectively. In respiratory recording classification tasks, the classification accuracy is improved by 2.69% for healthy-unhealthy binary classification and 1.47% for healthy, chronic, and non-chronic diagnosis. These results highlight the marked advantage of the ML-TCN over existing techniques, showcasing its potential to drive future innovations in respiratory healthcare technology.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541031","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Decoding Arm Movement Direction Using Ultra-High-Density EEG.
Pub Date: 2025-02-26 | DOI: 10.1109/JBHI.2025.3545856
Zhen Ma, Xinyi Yang, Jiayuan Meng, Kun Wang, Minpeng Xu, Dong Ming
Detecting arm movement direction is important for helping individuals with upper-limb motor disabilities restore independent self-care abilities. It requires accurately decoding fine arm movement patterns, which has become feasible using invasive brain-computer interfaces (BCIs). However, decoding multi-directional arm movements effectively remains a significant challenge for traditional electroencephalography (EEG) based BCIs. This study designed an ultra-high-density (UHD) EEG system to decode multi-directional arm movements. The system contains 200 electrodes with an inter-electrode spacing of about 4 mm. We analyzed the patterns of UHD EEG signals induced by arm movements in different directions. To extract discriminative features from UHD EEG, we proposed a spatial filtering method combining principal component analysis (PCA) and discriminative spatial patterns (DSP). We collected EEG signals from five healthy subjects (two left-handed and three right-handed) to verify the system's feasibility. The movement-related cortical potentials (MRCPs) showed a certain degree of separability in both waveforms and spatial patterns for arm movements in different directions. This study achieved an average classification accuracy of 63.15% (8.71%) for both arms (eight-class task), with a peak accuracy of 77.24%. For the dominant arm (four-class task), we obtained an average accuracy of 75.31% (9.21%), with a peak accuracy of 85.00%. For the first time, this study simultaneously decodes multi-directional movements of both arms using UHD EEG. It provides a promising approach for detecting arm movement direction, which is significant for the development of BCIs.
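The DSP step of the proposed PCA+DSP pipeline can be illustrated as a generalized eigenproblem on between-class versus within-class scatter. The following numpy/scipy sketch is one standard formulation under assumed data shapes (the PCA stage is omitted), not the authors' exact implementation.

```python
# Minimal sketch of discriminative spatial pattern (DSP) style filtering:
# solve S_b w = lambda * S_w w and keep the top eigenvectors.
import numpy as np
from scipy.linalg import eigh

def dsp_filters(epochs, labels, n_filters=4):
    # epochs: (n_trials, n_channels, n_samples)
    classes = np.unique(labels)
    means = {c: epochs[labels == c].mean(axis=0) for c in classes}
    grand = epochs.mean(axis=0)
    n_ch = epochs.shape[1]
    s_b = np.zeros((n_ch, n_ch))   # between-class scatter
    s_w = np.zeros((n_ch, n_ch))   # within-class scatter
    for c in classes:
        d = means[c] - grand
        s_b += d @ d.T
        for ep in epochs[labels == c]:
            e = ep - means[c]
            s_w += e @ e.T
    # Generalized eigenvectors with the largest between/within scatter ratio;
    # a small ridge keeps S_w positive definite.
    vals, vecs = eigh(s_b, s_w + 1e-6 * np.eye(n_ch))
    return vecs[:, np.argsort(vals)[::-1][:n_filters]]  # (n_ch, n_filters)

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 16, 200))   # 40 trials, 16 channels (toy data)
y = rng.integers(0, 4, 40)               # four movement directions
W = dsp_filters(X, y)
feats = np.einsum('cf,ncs->nfs', W, X)   # spatially filtered trials
print(W.shape, feats.shape)
```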
{"title":"Decoding Arm Movement Direction Using Ultra-High-Density EEG.","authors":"Zhen Ma, Xinyi Yang, Jiayuan Meng, Kun Wang, Minpeng Xu, Dong Ming","doi":"10.1109/JBHI.2025.3545856","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3545856","url":null,"abstract":"<p><p>Detecting arm movement direction is significant for individuals with upper-limb motor disabilities to restore independent self-care abilities. It involves accurately decoding the fine movement patterns of the arm, which has become feasible using invasive brain-computer interfaces (BCIs). However, it is still a significant challenge for traditional electroencephalography (EEG) based BCIs to decode multi-directional arm movements effectively. This study designed an ultra-high-density (UHD) EEG system to decode multi-directional arm movements. The system contains 200 electrodes with an interval of about 4 mm. We analyzed the patterns of the UHD EEG signals induced by arm movements in different directions. To extract discriminative features from UHD EEG, we proposed a spatial filtering method combining principal component analysis (PCA) and discriminative spatial pattern (DSP). We collected EEG signals from five healthy subjects (two left-handed and three right-handed) to verify the system's feasibility. The movement-related cortical potentials (MRCPs) showed a certain degree of separability both in waveforms and spatial patterns for arm movements in different directions. This study achieved an average classification accuracy of 63.15 (8.71)% for both arms (eight-class task) with a peak accuracy of 77.24%. For the dominant arm (four-class task), we obtained an average accuracy of 75.31 (9.21)% with a peak accuracy of 85.00%. For the first time, this study simultaneously decodes multi-directional movements of both arms using UHD EEG. This study provides a promising approach for detecting information about arm movement directions, which is significant for the development of BCIs.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accurate classification of port wine stains (PWS, vascular malformations present at birth) is critical for subsequent treatment planning. However, the current method of classifying PWS based on external skin appearance rarely reflects the underlying angiopathological heterogeneity of PWS lesions, resulting in inconsistent outcomes with common vascular-targeted photodynamic therapy (V-PDT) treatments. Conversely, optical coherence tomography angiography (OCTA) is an ideal tool for visualizing the vascular malformations of PWS. Previous studies have shown no significant correlation between OCTA quantitative metrics and the PWS subtypes determined by the current classification approach. In this study, we propose a novel fine-grained classification method for PWS that integrates OCT and OCTA imaging. Using a machine learning-based approach, we subdivided PWS into five distinct subtypes by uncovering the heterogeneity of hypodermic histopathology and vessel structures. Six quantitative metrics, encompassing the vascular morphology and depth information of PWS lesions, were designed and statistically analyzed to evaluate angiopathological differences among the subtypes. Our classification reveals significant distinctions across all metrics compared with conventional skin appearance-based subtypes, demonstrating its ability to accurately capture angiopathological heterogeneity. This research marks the first attempt to classify PWS based on angiopathology, potentially guiding more effective subtyping and treatment strategies for PWS.
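The overall analysis pattern, clustering lesions on quantitative vascular metrics and then testing for differences across the discovered subtypes, might look like the following sketch. The five-cluster choice mirrors the abstract, but the toy data, metric indices, and choice of the Kruskal-Wallis test are illustrative assumptions, not the paper's pipeline.

```python
# Minimal sketch: subtype lesions by clustering on quantitative metrics,
# then test each metric for differences across the discovered subtypes.
import numpy as np
from scipy.stats import kruskal
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 6))        # 100 lesions x 6 vascular metrics (toy)
Xz = StandardScaler().fit_transform(X)
subtype = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(Xz)

for j in range(X.shape[1]):
    groups = [X[subtype == k, j] for k in range(5)]
    h, p = kruskal(*groups)              # nonparametric across-group test
    print(f"metric {j}: H={h:.2f}, p={p:.3g}")
```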
{"title":"Fine-grained Classification Reveals Angiopathological Heterogeneity of Port Wine Stains Using OCT and OCTA Features.","authors":"Xiaofeng Deng, Defu Chen, Bowen Liu, Xiwan Zhang, Haixia Qiu, Wu Yuan, Hongliang Ren","doi":"10.1109/JBHI.2025.3545931","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3545931","url":null,"abstract":"<p><p>Accurate classification of port wine stains (PWS, vascular malformations present at birth), is critical for subsequent treatment planning. However, the current method of classifying PWS based on the external skin appearance rarely reflects the underlying angiopathological heterogeneity of PWS lesions, resulting in inconsistent outcomes with the common vascular-targeted photodynamic therapy (V-PDT) treatments. Conversely, optical coherence tomography angiography (OCTA) is an ideal tool for visualizing the vascular malformations of PWS. Previous studies have shown no significant correlation between OCTA quantitative metrics and the PWS subtypes determined by the current classification approach. In this study, we propose a novel fine-grained classification method for PWS that integrates OCT and OCTA imaging. Utilizing a machine learning-based approach, we subdivided PWS into five distinct subtypes by unearthing the heterogeneity of hypodermic histopathology and vessel structures. Six quantitative metrics, encompassing vascular morphology and depth information of PWS lesions, were designed and statistically analyzed to evaluate angiopathological differences among the subtypes. Our classification reveals significant distinctions across all metrics compared to conventional skin appearance-based subtypes, demonstrating its ability to accurately capture angiopathological heterogeneity. This research marks the first attempt to classify PWS based on angiopathology, potentially guiding more effective subtyping and treatment strategies for PWS.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-02-26DOI: 10.1109/JBHI.2025.3546148
Cem O Yaldiz, David J Lin, Asim H Gazi, Gabriela Cestero, Chuoqi Chen, Bethany K Bracken, Aaron Winder, Spencer Lynn, Reza Sameni, Omer T Inan
Forecasting the near-exact moments of cardiac phases is crucial for several cardiovascular health applications. For instance, forecasts can enable the timing of specific stimuli (e.g., image or text presentation in psycholinguistic experiments) to coincide with cardiac phases such as systole (cardiac ejection) and diastole (cardiac filling). This capability could be leveraged to enhance the amplitude of a subject's response, to prompt them in fight-or-flight scenarios, or to conduct retrospective analysis for physiological predictive models. While autoregressive models have been employed for physiological signal forecasting, no prior study has explored their application to forecasting aortic opening and closing timings. This work addresses this gap by presenting a comprehensive comparative analysis of autoregressive models, including various Kalman filter-based implementations, that use previously detected R-peak, aortic opening, and aortic closing timings from the electrocardiogram (ECG) and seismocardiogram (SCG) to forecast subsequent timings. We evaluate the robustness of these models to noise introduced both in the SCG signals and in the output of the feature detectors. Our findings indicate that time-varying and multi-feature algorithms outperform the others, with forecast errors below 2 ms for the R-peak, below 3 ms for aortic opening timing, and below 10 ms for aortic closing timing. Importantly, we elucidate the distinct advantages of integrating multi-feature models, which improve noise robustness, and time-varying approaches, which adapt to rapid physiological changes. These models can be extended to a wide range of short-term physiological predictive systems, such as acute stress detection, neuromodulation sensor feedback, or muscle fatigue monitoring, broadening their applicability beyond cardiac feature forecasting.
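As one concrete example of the simpler model family compared here, a scalar Kalman filter can track the inter-beat interval and forecast the next R-peak time from previously detected peaks. The numpy sketch below uses illustrative process and measurement noise values, not parameters from the paper.

```python
# Minimal sketch: scalar Kalman filter over the RR interval; the next
# R-peak forecast is the last detected peak plus the filtered interval.
import numpy as np

def forecast_next_peaks(peak_times, q=1e-4, r=4e-4):
    x, p = None, 1.0              # state: RR interval estimate (s), variance
    forecasts = []
    intervals = np.diff(peak_times)
    for t, rr in zip(peak_times[1:], intervals):
        if x is None:
            x = rr                # initialize from the first observed interval
        p += q                    # predict (random-walk interval model)
        k = p / (p + r)           # Kalman gain
        x += k * (rr - x)         # update with the newly observed interval
        p *= (1 - k)
        forecasts.append(t + x)   # forecast of the next R-peak time
    return np.array(forecasts)

rng = np.random.default_rng(0)
peaks = np.cumsum(0.8 + 0.02 * rng.standard_normal(20))  # toy peak times
print(forecast_next_peaks(peaks)[-3:])
```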
{"title":"Real-Time Autoregressive Forecast of Cardiac Features for Psychophysiological Applications.","authors":"Cem O Yaldiz, David J Lin, Asim H Gazi, Gabriela Cestero, Chuoqi Chen, Bethany K Bracken, Aaron Winder, Spencer Lynn, Reza Sameni, Omer T Inan","doi":"10.1109/JBHI.2025.3546148","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3546148","url":null,"abstract":"<p><p>Forecasting the near-exact moments of cardiac phases is crucial for several cardiovascular health applications. For instance, forecasts can enable the timing of specific stimuli (e.g., image or text presentation in psycholinguistic experiments) to coincide with cardiac phases like systole (cardiac ejection) and diastole (cardiac filling). This capability could be leveraged to enhance the amplitude of a subject's response, prompt them in fight-or-flight scenarios or conduct retrospective analysis for physiological predictive models. While autoregressive models have been employed for physiological signal forecasting, no prior study has explored their application to forecasting aortic opening and closing timings. This work addresses this gap by presenting a comprehensive comparative analysis of autoregressive models, including various forms of Kalman filter-based implementations, that use previously detected R-peak, aortic opening, and closing timings from electrocardiogram (ECG) and seismocardiogram (SCG) to forecast subsequent timings. We evaluate the robustness of these models to noise introduced in both SCG signals and the output of feature detectors. Our findings indicate that time-varying and multi-feature algorithms outperform others, with forecast errors below 2 ms for R-peak, below 3 ms for aortic opening timing, and below 10 ms for aortic closing timing. Importantly, we elucidate the distinct advantages of integrating multi-feature models, which improve noise robustness, and time-varying approaches, which adapt to rapid physiological changes. These models can be extended to a wide range of short-term physiological predictive systems, such as acute stress detection, neuromodulation sensor feedback, or muscle fatigue monitoring, broadening their applicability beyond cardiac feature forecasting.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Progressive cognitive decline spanning decades is characteristic of Alzheimer's disease (AD). Various predictive models have been designed to detect its onset early and to study the long-term trajectories of cognitive test scores across populations of interest. Research efforts have been geared towards superimposing patients' cognitive test scores onto the long-term trajectory of gradual cognitive decline while accounting for the heterogeneity of AD. Multiple long-term cognitive assessment trajectories have been developed based on various parameters, highlighting the importance of classifying groups by disease progression pattern. In this study, a novel method capable of self-organized prediction, classification, and superposition of long-term cognitive trajectories based on short-term individual data was developed, using statistical and differential equation modeling. Here, "self-organized" denotes a data-driven mechanism by which the prediction model adaptively configures its structure and parameters to classify individuals and estimate long-term trajectories. We validated the predictive accuracy of the proposed method on two cohorts: the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the Japanese ADNI. We also presented two practical illustrations: the simultaneous evaluation of risk factors associated with both the onset and the longitudinal progression of AD, and an innovative randomized controlled trial design for AD that standardizes the heterogeneity of patients enrolled in a clinical trial. These resources would improve the power of statistical hypothesis testing and help evaluate therapeutic effects. The application of predicting the trajectory of longitudinal disease progression goes beyond AD and is especially relevant for progressive neurodegenerative disorders.
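The superposition idea can be illustrated by assuming a common long-term decline curve and estimating, per subject, the time shift that best aligns their short-term scores with it. The scipy sketch below uses an assumed logistic curve and parameters purely for illustration; it is not the paper's statistical or differential equation model.

```python
# Minimal sketch: align a subject's short-term scores to a common
# long-term logistic decline curve by fitting a time shift.
import numpy as np
from scipy.optimize import minimize_scalar

def f(t, top=30.0, rate=0.25, t0=0.0):
    # Declining cognitive score (e.g., a 30-point scale), logistic in time.
    return top / (1.0 + np.exp(rate * (t - t0)))

def align(times, scores):
    # Find the shift d that places this subject's segment on the curve.
    sse = lambda d: np.sum((scores - f(times + d)) ** 2)
    return minimize_scalar(sse, bounds=(-20, 40), method='bounded').x

t = np.array([0.0, 0.5, 1.0, 1.5])     # years of follow-up
rng = np.random.default_rng(0)
y = f(t + 6.0) + 0.3 * rng.standard_normal(4)  # subject truly 6 years along
print(align(t, y))                     # recovered shift, approximately 6
```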
{"title":"Self-Organized Prediction-Classification-Superposition of Longitudinal Cognitive Decline in Alzheimer's Disease: An Application to Novel Clinical Research Methodology.","authors":"Hiroyuki Sato, Ryoichi Hanazawa, Keisuke Suzuki, Atsushi Hashizume, Akihiro Hirakawa","doi":"10.1109/JBHI.2025.3546020","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3546020","url":null,"abstract":"<p><p>Progressive cognitive decline spanning across decades is characteristic of Alzheimer's disease (AD). Various predictive models have been designed to realize its early onset and study the long-term trajectories of cognitive test scores across populations of interest. Research efforts have been geared towards superimposing patients' cognitive test scores with the long-term trajectory denoting gradual cognitive decline, while considering the heterogeneity of AD. Multiple trajectories representing cognitive assessment for the long-term have been developed based on various parameters, highlighting the importance of classifying several groups based on disease progression patterns. In this study, a novel method capable of self-organized prediction, classification, and the overlay of long-term cognitive trajectories based on short-term individual data was developed, based on statistical and differential equation modeling. Here, \"self-organized\" denotes a data-driven mechanism by which the prediction model adaptively configures its structure and parameters to classify individuals and estimate long-term trajectories. We validated the predictive accuracy of the proposed method on two cohorts: the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the Japanese ADNI. We also presented two practical illustrations of the simultaneous evaluation of risk factor associated with both the onset and the longitudinal progression of AD, and an innovative randomized controlled trial design for AD that standardizes the heterogeneity of patients enrolled in a clinical trial. These resources would improve the power of statistical hypothesis testing and help evaluate the therapeutic effect. The application of predicting the trajectory of longitudinal disease progression goes beyond AD, and is especially relevant for progressive and neurodegenerative disorders.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MLST-Net: Multi-Task Learning Based Spatial-Temporal Disentanglement Scheme for Video Facial Paralysis Severity Grading.
Pub Date: 2025-02-26 | DOI: 10.1109/JBHI.2025.3546019
Zehui Feng, Tongtong Zhou, Ting Han
Facial paralysis, a common nervous system disease, seriously affects patients' facial muscle function and appearance. Accurate facial paralysis grading is of great significance for formulating personalized treatment. Existing artificial intelligence-based grading methods focus largely on static image classification, which fails to capture dynamic facial movements. Additionally, due to privacy concerns, building comprehensive facial paralysis datasets is challenging, making it impractical to fully train a robust model from scratch. Finally, maintaining precision and inference speed on edge devices remains a key challenge. To address these shortcomings, we propose MLST-Net, a novel and explainable three-stage deep-learning method based on multi-task learning. In the first stage, a pre-trained model is used to extract the facial static appearance structure and dynamic texture changes. The second stage fuses the proxy-task results to construct a unified facial semantic representation and outputs the result of the simple "with or without facial paralysis" task. In the third stage, we use spatial-temporal disentanglement to capture the spatial-temporal combinatorial dependencies in video sequences, and the resulting features are fed to a classifier to obtain the results of the complex facial paralysis grading task. Compared with all advanced methods, MLST-Net is computationally inexpensive and achieves state-of-the-art results on 1,241 public dataset videos. It significantly benefits the digital diagnosis of facial palsy and offers innovative and explainable ideas for video-based digital medical treatment.
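The multi-task pattern of a shared representation feeding a simple binary head and a complex grading head can be sketched as below in PyTorch. The feature dimension, number of grades, and loss weighting are assumptions for illustration, not MLST-Net's actual configuration.

```python
# Minimal sketch of a two-head multi-task setup: one shared feature vector,
# a binary "palsy yes/no" head, and a multi-class severity head.
import torch
import torch.nn as nn

class TwoTaskHead(nn.Module):
    def __init__(self, feat_dim=256, n_grades=6):
        super().__init__()
        self.binary = nn.Linear(feat_dim, 2)        # simple task
        self.grade = nn.Linear(feat_dim, n_grades)  # complex grading task

    def forward(self, z):
        return self.binary(z), self.grade(z)

head = TwoTaskHead()
z = torch.randn(8, 256)                             # shared video features (toy)
yb = torch.randint(0, 2, (8,))                      # binary labels
yg = torch.randint(0, 6, (8,))                      # severity labels
logit_b, logit_g = head(z)
# Weighted sum of the two task losses; the 0.5 weight is illustrative.
loss = nn.functional.cross_entropy(logit_b, yb) \
     + 0.5 * nn.functional.cross_entropy(logit_g, yg)
print(loss.item())
```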
{"title":"MLST-Net: Multi-Task Learning based SpatialTemporal Disentanglement Scheme for Video Facial Paralysis Severity Grading.","authors":"Zehui Feng, Tongtong Zhou, Ting Han","doi":"10.1109/JBHI.2025.3546019","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3546019","url":null,"abstract":"<p><p>Facial paralysis, as a common nerve system disease, seriously affects the patients' facial muscle function and appearance. Accurate facial paralysis grading is of great significance for the formulation of personalized treatment. Existing artificial intelligence based grading methods extensively focus on static image classification, which fails to capture the dynamic facial movements. Additionally, due to private concerns, building comprehensive facial paralysis datasets is challenging, making it impractical to fully train a robust model from scratch. Finally, maintaining precision and inference speed on edge devices remains a key challenge. To address these shortcomings, we propose MLST-Net, a novel and explainable three-stage deep-learning method based on multi-task learning. In the first stage, the pre-trained model is used to extract the facial static appearance structure and dynamic texture changes. The second stage fuses the proxy task results to construct a unified face semantic expression and outputs the \"with or without facial paralysis\" simple task results. In the third stage, we use spatial-temporal disentanglement to capture the spatial-temporal combinatorial-dependencies in video sequences. Finally, we input the classifier to get the results of complex tasks of facial paralysis classification. Compared with all advanced methods, MLST-Net is computationally inexpensive and achieves state-of-the-art results on the 1241 public dataset videos. It significantly benefits the digital diagnosis of facial palsy and offers innovative and explainable ideas for video-based digital medical treatment.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As the scenarios for electrocardiogram (ECG) monitoring become increasingly diverse, particularly with the development of wearable ECG, the influence of ambiguous factors on diagnosis has been amplified. Reliable ECG information must be extracted from abundant noise and confusing artifacts. To address this issue, we propose an uncertainty-inspired model for beat-level diagnosis (UI-Beat). The base architecture of UI-Beat separates heartbeat localization and event diagnosis into two branches to address the problem of heterogeneous data sources. To disentangle epistemic and aleatoric uncertainty within one stage in a deterministic neural network, we propose a new method derived from an uncertainty formulation and realize it by introducing a class-biased transformation. The disentangled uncertainty can then be used to screen out noise and identify ambiguous heartbeats simultaneously. The results indicate that UI-Beat can significantly improve noise detection performance (from 91.60% to 97.50% for real-world noise detection and from 61.40% to 82.41% for real-world artifact detection). For multi-lead ECG analysis, UI-Beat approaches the performance upper bound in heartbeat localization (only 15 false positives and 9 false negatives out of the 175,907 heartbeats in the INCART database) and achieves a significant improvement in heartbeat classification through uncertainty-based cross-lead fusion compared with single-lead prediction and other state-of-the-art methods (an average improvement of 14.28% for detecting heartbeats of class S and 3.37% for heartbeats of class V). Given its one-stage, single-model design, UI-Beat has the potential to serve as a general model for arbitrary ECG monitoring scenarios, with the capacity to remove invalid episodes and deliver heartbeat-level diagnoses with accompanying confidence.
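For context, the conventional multi-pass decomposition of predictive uncertainty into aleatoric and epistemic parts (e.g., via MC dropout or ensembles) is sketched below in numpy. Note that UI-Beat's contribution is a one-stage deterministic alternative whose class-biased transformation is not reproduced here.

```python
# Minimal sketch of the standard entropy-based uncertainty decomposition:
# total predictive entropy = aleatoric part + epistemic part (mutual info).
import numpy as np

def decompose(probs):
    # probs: (n_passes, n_classes) softmax outputs for one heartbeat
    mean = probs.mean(axis=0)
    total = -(mean * np.log(mean + 1e-12)).sum()               # predictive entropy
    aleatoric = -(probs * np.log(probs + 1e-12)).sum(1).mean() # expected entropy
    return total - aleatoric, aleatoric                        # epistemic, aleatoric

p = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.7, 0.2],
              [0.2, 0.1, 0.7]])
print(decompose(p))   # high epistemic part: the passes disagree
```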
{"title":"Uncertainty-Inspired Multi-Task Learning in Arbitrary Scenarios of ECG Monitoring.","authors":"Xingyao Wang, Hongxiang Gao, Caiyun Ma, Tingting Zhu, Feng Yang, Chengyu Liu, Huazhu Fu","doi":"10.1109/JBHI.2025.3545927","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3545927","url":null,"abstract":"<p><p>As the scenarios for electrocardiogram (ECG) monitoring become increasingly diverse, particularly with the development of wearable ECG, the influence of ambiguous factors in diagnosis has been amplified. Reliable ECG information must be extracted from abundant noises and confusing artifacts. To address this issue, we suggest an uncertainty-inspired model for beat-level diagnosis (UI-Beat). The base architecture of UI-Beat separates heartbeat localization and event diagnosis in two branches to address the problem of heterogeneous data sources. To disentangle the epistemic and aleatoric uncertainty within one stage in a deterministic neural network, we propose a new method derived from uncertainty formulation and realize it by introducing the class-biased transformation. Then the disentangled uncertainty can be utilized to screen out noise and identify ambiguous heartbeat synchronously. The results indicate that UI-Beat can significantly improve the performance of noise detection (from 91.60% to 97.50% for real-world noise detection and from 61.40% to 82.41% for real-world artifact detection). For multi-lead ECG analysis, UI-Beat is approaching the performance upper bound in heartbeat localization (only 15 false positives and 9 false negatives out of the 175,907 heartbeats in the INCART database) and achieving a significant performance improvement in heartbeat classification through uncertainty-based cross-lead fusion compared to single-lead prediction and other state-of-the-art methods (an average improvement of 14.28% for detecting heartbeats of S and 3.37% for detecting heartbeats of V). Considering the characteristic of one-stage ECG analysis within one model, it is suggested that the proposed UI-Beat has the potential to be employed as a general model for arbitrary scenarios of ECG monitoring, with the capacity to remove invalid episodes, and realize heartbeat-level diagnosis with confidence provided.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-organ Segmentation from Partially Labeled and Unaligned Multi-modal MRI in Thyroid-associated Orbitopathy.
Pub Date: 2025-02-25 | DOI: 10.1109/JBHI.2025.3545138
Cheng Chen, Min Deng, Yuan Zhong, Jinyue Cai, Karen Kar Wun Chan, Qi Dou, Kelvin Kam Lung Chong, Pheng-Ann Heng, Winnie Chiu-Wing Chu
Thyroid-associated orbitopathy (TAO) is a prevalent inflammatory autoimmune disorder that leads to orbital disfigurement and visual disability. Automatic comprehensive segmentation tailored for quantitative multi-modal MRI assessment of TAO holds enormous promise but is still lacking. In this paper, we propose a novel method, named cross-modal attentive self-training (CMAST), for multi-organ segmentation in TAO using partially labeled and unaligned multi-modal MRI data. Our method first introduces a dedicated cross-modal pseudo-label self-training scheme, which leverages self-training to refine the initial pseudo-labels generated by cross-modal registration so as to complete the label sets for comprehensive segmentation. With the obtained pseudo-labels, we further devise a learnable attentive fusion module that aggregates multi-modal knowledge based on learned cross-modal feature attention, which relaxes the requirement of pixel-wise alignment across modalities. A prototypical contrastive learning loss is further incorporated to facilitate cross-modal feature alignment. We evaluate our method on a large clinical TAO cohort with 100 cases of multi-modal orbital MRI. The experimental results demonstrate the promising performance of our method in achieving comprehensive segmentation of TAO-affected organs on both T1 and T1c modalities, outperforming previous methods by a large margin. Code will be released upon acceptance.
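A common form of prototypical contrastive loss, pulling each embedding toward its class prototype and away from the other prototypes, is sketched below in PyTorch. The temperature and shapes are assumptions for illustration; the paper's exact loss may differ.

```python
# Minimal sketch of a prototypical contrastive loss: class prototypes are
# mean embeddings, and each embedding is classified against them.
import torch
import torch.nn.functional as F

def proto_contrastive(emb, labels, tau=0.1):
    # emb: (n, d) embeddings; labels: (n,) integer class ids
    emb = F.normalize(emb, dim=1)
    classes = labels.unique()  # sorted ascending
    protos = torch.stack([emb[labels == c].mean(0) for c in classes])
    protos = F.normalize(protos, dim=1)
    logits = emb @ protos.t() / tau                 # cosine similarities / tau
    targets = torch.searchsorted(classes, labels)   # map ids to 0..K-1
    return F.cross_entropy(logits, targets)

emb = torch.randn(64, 32)                 # toy pixel/region embeddings
labels = torch.randint(0, 4, (64,))       # toy organ labels
print(proto_contrastive(emb, labels).item())
```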
{"title":"Multi-organ Segmentation from Partially Labeled and Unaligned Multi-modal MRI in Thyroid-associated Orbitopathy.","authors":"Cheng Chen, Min Deng, Yuan Zhong, Jinyue Cai, Karen Kar Wun Chan, Qi Dou, Kelvin Kam Lung Chong, Pheng-Ann Heng, Winnie Chiu-Wing Chu","doi":"10.1109/JBHI.2025.3545138","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3545138","url":null,"abstract":"<p><p>Thyroid-associated orbitopathy (TAO) is a prevalent inflammatory autoimmune disorder, leading to orbital disfigurement and visual disability. Automatic comprehensive segmentation tailored for quantitative multi-modal MRI assessment of TAO holds enormous promise but is still lacking. In this paper, we propose a novel method, named cross-modal attentive self-training (CMAST), for the multi-organ segmentation in TAO using partially labeled and unaligned multi-modal MRI data. Our method first introduces a dedicatedly designed cross-modal pseudo label self-training scheme, which leverages self-training to refine the initial pseudo labels generated by cross-modal registration, so as to complete the label sets for comprehensive segmentation. With the obtained pseudo labels, we further devise a learnable attentive fusion module to aggregate multi-modal knowledge based on learned cross-modal feature attention, which relaxes the requirement of pixel-wise alignment across modalities. A prototypical contrastive learning loss is further incorporated to facilitate cross-modal feature alignment. We evaluate our method on a large clinical TAO cohort with 100 cases of multi-modal orbital MRI. The experimental results demonstrate the promising performance of our method in achieving comprehensive segmentation of TAO-affected organs on both T1 and T1c modalities, outperforming previous methods by a large margin. Code will be released upon acceptance.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}