Pub Date : 2025-02-26 DOI: 10.1109/JBHI.2025.3546019
MLST-Net: Multi-Task Learning based Spatial-Temporal Disentanglement Scheme for Video Facial Paralysis Severity Grading
Zehui Feng, Tongtong Zhou, Ting Han
Facial paralysis, a common nervous system disorder, severely affects patients' facial muscle function and appearance. Accurate facial paralysis grading is essential for formulating personalized treatment. Existing artificial-intelligence-based grading methods focus largely on static image classification, which fails to capture dynamic facial movements. Additionally, because of privacy concerns, building comprehensive facial paralysis datasets is challenging, making it impractical to fully train a robust model from scratch. Finally, maintaining both precision and inference speed on edge devices remains a key challenge. To address these shortcomings, we propose MLST-Net, a novel and explainable three-stage deep-learning method based on multi-task learning. In the first stage, a pre-trained model extracts the static facial appearance structure and dynamic texture changes. The second stage fuses the proxy-task results to construct a unified facial semantic representation and outputs the result of the simple task of detecting whether facial paralysis is present. In the third stage, we use spatial-temporal disentanglement to capture the combined spatial-temporal dependencies in video sequences. Finally, the resulting representation is fed to a classifier to solve the complex task of facial paralysis severity classification. Compared with advanced methods, MLST-Net is computationally inexpensive and achieves state-of-the-art results on 1,241 videos from public datasets. It significantly benefits the digital diagnosis of facial palsy and offers innovative, explainable ideas for video-based digital medical treatment.
{"title":"MLST-Net: Multi-Task Learning based SpatialTemporal Disentanglement Scheme for Video Facial Paralysis Severity Grading.","authors":"Zehui Feng, Tongtong Zhou, Ting Han","doi":"10.1109/JBHI.2025.3546019","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3546019","url":null,"abstract":"<p><p>Facial paralysis, as a common nerve system disease, seriously affects the patients' facial muscle function and appearance. Accurate facial paralysis grading is of great significance for the formulation of personalized treatment. Existing artificial intelligence based grading methods extensively focus on static image classification, which fails to capture the dynamic facial movements. Additionally, due to private concerns, building comprehensive facial paralysis datasets is challenging, making it impractical to fully train a robust model from scratch. Finally, maintaining precision and inference speed on edge devices remains a key challenge. To address these shortcomings, we propose MLST-Net, a novel and explainable three-stage deep-learning method based on multi-task learning. In the first stage, the pre-trained model is used to extract the facial static appearance structure and dynamic texture changes. The second stage fuses the proxy task results to construct a unified face semantic expression and outputs the \"with or without facial paralysis\" simple task results. In the third stage, we use spatial-temporal disentanglement to capture the spatial-temporal combinatorial-dependencies in video sequences. Finally, we input the classifier to get the results of complex tasks of facial paralysis classification. Compared with all advanced methods, MLST-Net is computationally inexpensive and achieves state-of-the-art results on the 1241 public dataset videos. It significantly benefits the digital diagnosis of facial palsy and offers innovative and explainable ideas for video-based digital medical treatment.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-02-26 DOI: 10.1109/JBHI.2025.3545927
Uncertainty-Inspired Multi-Task Learning in Arbitrary Scenarios of ECG Monitoring
Xingyao Wang, Hongxiang Gao, Caiyun Ma, Tingting Zhu, Feng Yang, Chengyu Liu, Huazhu Fu
As the scenarios for electrocardiogram (ECG) monitoring become increasingly diverse, particularly with the development of wearable ECG, the influence of ambiguous factors on diagnosis has been amplified. Reliable ECG information must be extracted from abundant noise and confounding artifacts. To address this issue, we propose an uncertainty-inspired model for beat-level diagnosis (UI-Beat). The base architecture of UI-Beat separates heartbeat localization and event diagnosis into two branches to address the problem of heterogeneous data sources. To disentangle epistemic and aleatoric uncertainty within a single stage in a deterministic neural network, we propose a new method derived from an uncertainty formulation and realize it by introducing a class-biased transformation. The disentangled uncertainty can then be used to screen out noise and identify ambiguous heartbeats simultaneously. The results indicate that UI-Beat significantly improves noise detection (from 91.60% to 97.50% for real-world noise detection and from 61.40% to 82.41% for real-world artifact detection). For multi-lead ECG analysis, UI-Beat approaches the performance upper bound in heartbeat localization (only 15 false positives and 9 false negatives out of the 175,907 heartbeats in the INCART database) and achieves a significant improvement in heartbeat classification through uncertainty-based cross-lead fusion compared with single-lead prediction and other state-of-the-art methods (an average improvement of 14.28% for detecting class-S heartbeats and 3.37% for detecting class-V heartbeats). Given its one-stage, single-model design, the proposed UI-Beat has the potential to serve as a general model for arbitrary ECG monitoring scenarios, with the capacity to remove invalid episodes and provide heartbeat-level diagnoses with accompanying confidence.
{"title":"Uncertainty-Inspired Multi-Task Learning in Arbitrary Scenarios of ECG Monitoring.","authors":"Xingyao Wang, Hongxiang Gao, Caiyun Ma, Tingting Zhu, Feng Yang, Chengyu Liu, Huazhu Fu","doi":"10.1109/JBHI.2025.3545927","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3545927","url":null,"abstract":"<p><p>As the scenarios for electrocardiogram (ECG) monitoring become increasingly diverse, particularly with the development of wearable ECG, the influence of ambiguous factors in diagnosis has been amplified. Reliable ECG information must be extracted from abundant noises and confusing artifacts. To address this issue, we suggest an uncertainty-inspired model for beat-level diagnosis (UI-Beat). The base architecture of UI-Beat separates heartbeat localization and event diagnosis in two branches to address the problem of heterogeneous data sources. To disentangle the epistemic and aleatoric uncertainty within one stage in a deterministic neural network, we propose a new method derived from uncertainty formulation and realize it by introducing the class-biased transformation. Then the disentangled uncertainty can be utilized to screen out noise and identify ambiguous heartbeat synchronously. The results indicate that UI-Beat can significantly improve the performance of noise detection (from 91.60% to 97.50% for real-world noise detection and from 61.40% to 82.41% for real-world artifact detection). For multi-lead ECG analysis, UI-Beat is approaching the performance upper bound in heartbeat localization (only 15 false positives and 9 false negatives out of the 175,907 heartbeats in the INCART database) and achieving a significant performance improvement in heartbeat classification through uncertainty-based cross-lead fusion compared to single-lead prediction and other state-of-the-art methods (an average improvement of 14.28% for detecting heartbeats of S and 3.37% for detecting heartbeats of V). Considering the characteristic of one-stage ECG analysis within one model, it is suggested that the proposed UI-Beat has the potential to be employed as a general model for arbitrary scenarios of ECG monitoring, with the capacity to remove invalid episodes, and realize heartbeat-level diagnosis with confidence provided.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-02-25 DOI: 10.1109/JBHI.2025.3545138
Multi-organ Segmentation from Partially Labeled and Unaligned Multi-modal MRI in Thyroid-associated Orbitopathy
Cheng Chen, Min Deng, Yuan Zhong, Jinyue Cai, Karen Kar Wun Chan, Qi Dou, Kelvin Kam Lung Chong, Pheng-Ann Heng, Winnie Chiu-Wing Chu
Thyroid-associated orbitopathy (TAO) is a prevalent inflammatory autoimmune disorder that leads to orbital disfigurement and visual disability. Automatic comprehensive segmentation tailored for quantitative multi-modal MRI assessment of TAO holds enormous promise but is still lacking. In this paper, we propose a novel method, named cross-modal attentive self-training (CMAST), for multi-organ segmentation in TAO using partially labeled and unaligned multi-modal MRI data. Our method first introduces a dedicated cross-modal pseudo-label self-training scheme, which leverages self-training to refine the initial pseudo labels generated by cross-modal registration, so as to complete the label sets for comprehensive segmentation. With the obtained pseudo labels, we further devise a learnable attentive fusion module to aggregate multi-modal knowledge based on learned cross-modal feature attention, which relaxes the requirement of pixel-wise alignment across modalities. A prototypical contrastive learning loss is further incorporated to facilitate cross-modal feature alignment. We evaluate our method on a large clinical TAO cohort with 100 cases of multi-modal orbital MRI. The experimental results demonstrate the promising performance of our method in achieving comprehensive segmentation of TAO-affected organs on both T1 and T1c modalities, outperforming previous methods by a large margin. Code will be released upon acceptance.
{"title":"Multi-organ Segmentation from Partially Labeled and Unaligned Multi-modal MRI in Thyroid-associated Orbitopathy.","authors":"Cheng Chen, Min Deng, Yuan Zhong, Jinyue Cai, Karen Kar Wun Chan, Qi Dou, Kelvin Kam Lung Chong, Pheng-Ann Heng, Winnie Chiu-Wing Chu","doi":"10.1109/JBHI.2025.3545138","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3545138","url":null,"abstract":"<p><p>Thyroid-associated orbitopathy (TAO) is a prevalent inflammatory autoimmune disorder, leading to orbital disfigurement and visual disability. Automatic comprehensive segmentation tailored for quantitative multi-modal MRI assessment of TAO holds enormous promise but is still lacking. In this paper, we propose a novel method, named cross-modal attentive self-training (CMAST), for the multi-organ segmentation in TAO using partially labeled and unaligned multi-modal MRI data. Our method first introduces a dedicatedly designed cross-modal pseudo label self-training scheme, which leverages self-training to refine the initial pseudo labels generated by cross-modal registration, so as to complete the label sets for comprehensive segmentation. With the obtained pseudo labels, we further devise a learnable attentive fusion module to aggregate multi-modal knowledge based on learned cross-modal feature attention, which relaxes the requirement of pixel-wise alignment across modalities. A prototypical contrastive learning loss is further incorporated to facilitate cross-modal feature alignment. We evaluate our method on a large clinical TAO cohort with 100 cases of multi-modal orbital MRI. The experimental results demonstrate the promising performance of our method in achieving comprehensive segmentation of TAO-affected organs on both T1 and T1c modalities, outperforming previous methods by a large margin. Code will be released upon acceptance.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-02-25 DOI: 10.1109/JBHI.2025.3545159
Adaptive Metadata-Guided Supervised Contrastive Learning for Domain Adaptation on Respiratory Sound Classification
June-Woo Kim, Miika Toikkanen, Amin Jalali, Minseok Kim, Hye-Ji Han, Hyunwoo Kim, Wonwoo Shin, Ho-Young Jung, Kyunghoon Kim
Despite considerable advances in deep learning, optimizing respiratory sound classification (RSC) models remains challenging. This is partly due to bias from inconsistent respiratory sound recording processes and imbalanced representation of demographics, which leads to poor performance when a model trained on such a dataset is applied to real-world use cases. RSC datasets usually include metadata attributes describing certain aspects of the data, such as environmental and demographic factors. To address the issues caused by bias, we take advantage of the metadata provided by RSC datasets and explore approaches for metadata-guided domain adaptation. We thoroughly evaluate the effect of various metadata attributes and their combinations within a simple metadata-guided approach, and we also introduce a more advanced method that adaptively rescales suitable metadata combinations to improve domain adaptation during training. The findings indicate a robust reduction in domain dependency and an improvement in detection accuracy on both the ICBHI dataset and our own dataset. Specifically, our proposed methods improved the score to 84.97%, a substantial gain of 7.37% over the baseline model.
{"title":"Adaptive Metadata-Guided Supervised Contrastive Learning for Domain Adaptation on Respiratory Sound Classification.","authors":"June-Woo Kim, Miika Toikkanen, Amin Jalali, Minseok Kim, Hye-Ji Han, Hyunwoo Kim, Wonwoo Shin, Ho-Young Jung, Kyunghoon Kim","doi":"10.1109/JBHI.2025.3545159","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3545159","url":null,"abstract":"<p><p>Despite considerable advancements in deep learning, optimizing respiratory sound classification (RSC) models remains challenging. This is partly due to the bias from inconsistent respiratory sound recording processes and imbalanced representation of demographics, which leads to poor performance when a model trained with the dataset is applied to real-world use cases. RSC datasets usually include various metadata attributes describing certain aspects of the data, such as environmental and demographic factors. To address the issues caused by bias, we take advantage of the metadata provided by RSC datasets and explore approaches for metadata-guided domain adaptation. We thoroughly evaluate the effect of various metadata attributes and their combinations on a simple metadata-guided approach, but also introduce a more advanced method that adaptively rescales the suitable metadata combinations to improve domain adaptation during training. The findings indicate a robust reduction in domain dependency and improvement in detection accuracy on both ICBHI and our own dataset. Specifically, the implementation of our proposed methods led to an improved score of 84.97%, which signifies a substantial enhancement of 7.37% compared to the baseline model.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-02-25 DOI: 10.1109/JBHI.2025.3545172
Contrastive Learning Guided Fusion Network for Brain CT and MRI
Yuping Huang, Weisheng Li, Bin Xiao, Guofen Wang, Dan He, Xiaoyu Qiao
Medical image fusion technology provides professionals with more detailed and precise diagnostic information. This paper introduces a new, efficient CT and MRI fusion network, CLGFusion, guided by contrastive learning. CLGFusion includes two encoding branches at the feature encoding stage, enabling them to interact and learn from each other. The approach begins by training a single-view encoder to predict the feature representation of an image from varied augmented views. Simultaneously, the multi-view encoder is updated using the exponential moving average of the single-view encoder. Contrastive learning is integrated into medical image fusion by creating a feature contrast space without constructing negative samples. This feature contrast space exploits the differences between the feature representations of the source image and its corresponding augmented image. Combined with a structural similarity loss, it continuously guides the network to optimize its fusion results, achieving more accurate and efficient image fusion. The approach is an end-to-end unsupervised fusion model. Experimental validation shows that the proposed method performs comparably to state-of-the-art techniques in both subjective evaluation and objective metrics.
{"title":"Contrastive Learning Guided Fusion Network for Brain CT and MRI.","authors":"Yuping Huang, Weisheng Li, Bin Xiao, Guofen Wang, Dan He, Xiaoyu Qiao","doi":"10.1109/JBHI.2025.3545172","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3545172","url":null,"abstract":"<p><p>Medical image fusion technology provides professionals with more detailed and precise diagnostic information. This paper introduces a new efficient CT and MRI fusion network, CLGFusion, based on a contrastive learning-guided network. CLGFusion includes two encoding branches at the feature encoding stage, enabling them to interact and learn from each other. The approach begins with training a single-view encoder to predict the feature representation of an image from varied augmented views. Simultaneously, the multi-view encoder is improved using the exponential moving average of the single-view encoder. Contrastive learning is integrated into medical image fusion by creating a feature contrast space without constructing negative samples. This feature contrast space cleverly uses the information of the difference in the feature product of the source image and its corresponding augmented image. It continuously guides the network to constantly optimize its fusion effect by combining the method of structural similarity loss, to achieve more accurate and efficient image fusion. This approach represents an end-to-end unsupervised fusion model. Experimental validation shows that our proposed method demonstrates performance comparable to state-of-the-art techniques in both subjective evaluation and objective metrics.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-02-25 DOI: 10.1109/JBHI.2025.3543686
COVID-BLUeS - A Prospective Study on the Value of AI in Lung Ultrasound Analysis
Nina Wiedemann, Dianne de Korte-de Boer, Matthias Richter, Sjors van de Weijer, Charlotte Buhre, Franz A M Eggert, Sophie Aarnoudse, Lotte Grevendonk, Steffen Rober, Carlijn M E Remie, Wolfgang Buhre, Ronald Henry, Jannis Born
As a lightweight and non-invasive imaging technique, lung ultrasound (LUS) has gained importance for assessing lung pathologies. The use of artificial intelligence (AI) in medical decision support systems is promising because LUS interpretation is time- and expertise-intensive; however, owing to the poor quality of the existing data used to train AI models, their usability for real-world applications remains unclear.
Methods: In a prospective study, we analyze data from 63 COVID-19 suspects (33 positive) collected at Maastricht University Medical Centre. Ultrasound recordings at six body locations were acquired following the BLUE protocol and manually labeled for severity of lung involvement. Anamnesis and complete blood count (CBC) analyses were conducted. Several AI models were applied and trained to detect pulmonary infection and assess its severity.
Results: The severity of the lung infection, as assigned by human annotators based on the LUS videos, is not significantly different between COVID-19 positive and negative patients. Nevertheless, the predictions of image-based AI models identify a COVID-19 infection with 65% accuracy when applied zero-shot (i.e., trained on other datasets), and with up to 79% accuracy after targeted training, whereas the accuracy based on human annotations is at most 65%. Multi-modal models combining images and CBC improve significantly over image-only models.
Conclusion: Although our analysis generally supports the value of AI in LUS assessment, the evaluated models fall short of the performance expected from previous work. We find this is due to 1) the heterogeneity of LUS datasets, which limits generalization to new data, 2) the frame-based processing of AI models, which ignores video-level information, and 3) the lack of multi-modal models that can extract the most relevant information from video-, image-, and variable-based inputs. To aid future research, we publish the dataset at: https://github.com/NinaWie/COVID-BLUES.
{"title":"COVID-BLUeS - A Prospective Study on the Value of AI in Lung Ultrasound Analysis.","authors":"Nina Wiedemann, Dianne de Korte-de Boer, Matthias Richter, Sjors van de Weijer, Charlotte Buhre, Franz A M Eggert, Sophie Aarnoudse, Lotte Grevendonk, Steffen Rober, Carlijn M E Remie, Wolfgang Buhre, Ronald Henry, Jannis Born","doi":"10.1109/JBHI.2025.3543686","DOIUrl":"10.1109/JBHI.2025.3543686","url":null,"abstract":"<p><p>As a lightweight and non-invasive imaging technique, lung ultrasound (LUS) has gained importance for assessing lung pathologies. The use of Artificial intelligence (AI) in medical decision support systems is promising due to the time- and expertise-intensive interpretation, however, due to the poor quality of existing data used for training AI models, their usability for real-world applications remains unclear.</p><p><strong>Methods: </strong>In a prospective study, we analyze data from 63 COVID-19 suspects (33 positive) collected at Maastricht University Medical Centre. Ultrasound recordings at six body locations were acquired following the BLUE protocol and manually labeled for severity of lung involvement. Anamnesis and complete blood count (CBC) analyses were conducted. Several AI models were applied and trained for detection and severity of pulmonary infection.</p><p><strong>Results: </strong>The severity of the lung infection, as assigned by human annotators based on the LUS videos, is not significantly different between COVID-19 positive and negative patients (). Nevertheless, the predictions of image-based AI models identify a COVID-19 infection with 65% accuracy when applied zero-shot (i.e., trained on other datasets), and up to 79% with targeted training, whereas the accuracy based on human annotations is at most 65%. Multi-modal models combining images and CBC improve significantly over image-only models.</p><p><strong>Conclusion: </strong>Although our analysis generally supports the value of AI in LUS assessment, the evaluated models fall short of the performance expected from previous work. We find this is due to 1) the heterogeneity of LUS datasets, limiting the generalization ability to new data, 2) the frame-based processing of AI models ignoring video-level information, and 3) lack of work on multi-modal models that can extract the most relevant information from video-, image- and variable-based inputs. To aid future research, we publish the dataset at: https://github.com/NinaWie/COVID-BLUES.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-02-24 DOI: 10.1109/JBHI.2025.3544966
Beyond the ground truth, XGBoost model applied to sleep spindle event detection
Enrique Gurdiel, Fernando Vaquerizo-Villar, Javier Gomez-Pilar, Gonzalo C Gutierrez-Tobal, Felix Del Campo, Roberto Hornero
Sleep spindles are microevents of the sleep electroencephalogram (EEG) whose functional interpretation is not fully clear. To streamline the identification process and make it more replicable, multiple automatic detectors have been proposed in the literature. Among these methods, algorithms based on deep learning have so far demonstrated superior accuracy in performance assessments. However, with these methods the rationale behind the model's decision-making process is hard to understand. In this study, we propose a novel machine-learning detection framework (SpinCo) based on exhaustive sliding-window feature extraction and the XGBoost algorithm, achieving performance close to state-of-the-art deep-learning techniques while depending on a fixed set of easily interpretable features. Additionally, we have developed a novel by-event evaluation metric that ensures symmetry and allows a probabilistic interpretation of the results. Through this metric, we enhance the interpretability of our evaluations and enable a direct assessment of inter-expert agreement in the manual annotation of spindle events. Finally, we propose a new type of performance assessment test based on estimates of the automatic method's ability to generalize to unseen experts and on its comparison with inter-expert agreement measurements. Hence, SpinCo is a robust automatic spindle detection technique that can be used for labeling raw EEG signals, and it sheds light on the metrics used for evaluation in this problem.
{"title":"Beyond the ground truth, XGBoost model applied to sleep spindle event detection.","authors":"Enrique Gurdiel, Fernando Vaquerizo-Villar, Javier Gomez-Pilar, Gonzalo C Gutierrez-Tobal, Felix Del Campo, Roberto Hornero","doi":"10.1109/JBHI.2025.3544966","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3544966","url":null,"abstract":"<p><p>Sleep spindles are microevents of the electroencephalogram (EEG) during sleep whose functional interpretation is not fully clear. To streamline the identification process and make it more replicable, multiple automatic detectors have been proposed in the literature. Among these methods, algorithms based on deep learning usually demonstrate superior accuracy in performance assessment up to now. However, using these methods, the rationale behind the model decision-making process is hard to understand. In this study, we propose a novel machine-learning detection framework (SpinCo) based on an exhaustive sliding window feature extraction and the application of XGBoost algorithm, achieving performance close to state-of-the-art deep-learning techniques while depending on a fixed set of easily interpretable features. Additionally, we have developed a novel by-event metric for evaluation that ensures symmetricity and allows a probabilistic interpretation of the results. Through the utilization of this metric, we have enhanced the interpretability of our evaluations and enabled a direct assessment of inter-expert agreement in the manual annotation of spindle events. Finally, we propose a new type of performance assessment test based on estimations of the automatic method's ability to generalize to unseen experts and its comparison with inter-expert agreement measurements. Hence, Spinco is a robust automatic spindle detection technique that can be used for labeling raw EEG signals and shed light on the metrics used for evaluation in this problem.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-02-24 DOI: 10.1109/JBHI.2025.3544612
Hierarchically Optimized Multiple Instance Learning With Multi-Magnification Pathological Images for Cerebral Tumor Diagnosis
Lianghui Zhu, Renao Yan, Tian Guan, Fenfen Zhang, Linlang Guo, Qiming He, Shanshan Shi, Huijuan Shi, Yonghong He, Anjia Han
Accurate diagnosis of cerebral tumors is crucial for effective clinical therapeutics and prognosis. However, the limited availability of brain biopsy tissue and the scarcity of pathologists specializing in cerebral tumors hinder comprehensive clinical testing for precise diagnosis. To address these challenges, we first established a brain tumor dataset of 3,520 cases collected from multiple centers. We then proposed a novel Hierarchically Optimized Multiple Instance Learning (HOMIL) method for classifying six common brain tumor types, grading gliomas, and predicting the origin of brain metastatic cancers. The feature encoder and aggregator in HOMIL were trained alternately based on specific datasets and tasks. Compared to other multiple instance learning (MIL) methods, HOMIL achieved state-of-the-art performance with impressive accuracies: 93.29% / 85.60% for brain tumor classification, 91.21% / 96.93% for glioma grading, and 86.36% / 79.28% for origin determination on internal / external datasets. Additionally, HOMIL effectively located multi-scale regions of interest, enabling in-depth analysis through features and heatmaps. Extensive visualization demonstrated HOMIL's ability to cluster features of the same type while establishing distinct boundaries between tumor types. It also identified critical areas on pathological slides, regardless of tumor size.
{"title":"Hierarchically Optimized Multiple Instance Learning With Multi-Magnification Pathological Images for Cerebral Tumor Diagnosis.","authors":"Lianghui Zhu, Renao Yan, Tian Guan, Fenfen Zhang, Linlang Guo, Qiming He, Shanshan Shi, Huijuan Shi, Yonghong He, Anjia Han","doi":"10.1109/JBHI.2025.3544612","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3544612","url":null,"abstract":"<p><p>Accurate diagnosis of cerebral tumors is crucial for effective clinical therapeutics and prognosis. However, limitations in brain biopsy tissues and the scarcity of pathologists specializing in cerebral tumors hinder comprehensive clinical tests for precise diagnosis. To address these challenges, we first established a brain tumor dataset of 3,520 cases collected from multiple centers. We then proposed a novel Hierarchically Optimized Multiple Instance Learning (HOMIL) method for classifying six common brain tumor types, glioma grading, and predicting the origin of brain metastatic cancers. The feature encoder and aggregator in HOMIL were trained alternately based on specific datasets and tasks. Compared to other multiple instance learning (MIL) methods, HOMIL achieved state-of-the-art performance with impressive accuracies: 93.29% / 85.60% for brain tumor classification, 91.21% / 96.93% for glioma grading, and 86.36% / 79.28% for origin determination on internal/external datasets. Additionally, HOMIL effectively located multi-scale regions of interest, enabling an in-depth analysis through features and heatmaps. Extensive visualization demonstrated HOMIL's ability to cluster features within the same type while establishing distinct boundaries between tumor types. It also identified critical areas on pathological slides, regardless of tumor size.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-02-24 DOI: 10.1109/JBHI.2025.3545265
Multi-Scale Spatio-Temporal Attention Network for Epileptic Seizure Prediction
Qiulei Dong, Han Zhang, Jun Xiao, Jiayin Sun
Epileptic seizure prediction from electroencephalogram (EEG) data has attracted much attention in the clinical diagnosis and treatment of epilepsy. Most existing methods in the literature extract either spatial or temporal features at a single scale from EEG data; however, because EEG data are generally complex and severely noisy, their learned features tend to be insufficiently discriminative, leading to low-accuracy predictions. To address this problem, we propose a Multi-scale Spatio-temporal Attention Network, called MSAN, to learn discriminative features for seizure prediction; it contains a backbone module, a spatial pyramid module, and a multi-scale sequential aggregation module. The backbone module extracts initial spatial features from the input EEG spectrograms, and the pyramid module learns multi-scale features from these initial features. Taking the multi-scale features as temporal inputs, the sequential aggregation module employs multiple Long Short-Term Memory (LSTM) blocks to aggregate them. In addition, a dual-loss function is introduced to alleviate the class imbalance problem. The proposed method achieves an average sensitivity of 96.27% with a mean false prediction rate of 0.00/h on the CHB-MIT dataset and an average sensitivity of 93.57% with a mean false prediction rate of 0.044/h on the Kaggle dataset. The comparative results demonstrate that the proposed method outperforms 10 state-of-the-art epileptic seizure prediction models.
{"title":"Multi-Scale Spatio-Temporal Attention Network for Epileptic Seizure Prediction.","authors":"Qiulei Dong, Han Zhang, Jun Xiao, Jiayin Sun","doi":"10.1109/JBHI.2025.3545265","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3545265","url":null,"abstract":"<p><p>Epileptic seizure prediction from electroencephalogram (EEG) data has attracted much attention in the clinical diagnosis and treatment of epilepsy. Most of the existing methods in literature extract either spatial or temporal features at a single scale from EEG data, however, their learned features are generally less discriminative since the EEG data is complex and severely noisy in general, leading to low-accuracy predictions. To address this problem, we propose a Multi-scale Spatio-temporal Attention Network to learn discriminative features for seizure prediction, called MSAN, which contains a backbone module, a spatial pyramid module, and a multi-scale sequential aggregation module. The backbone module is to extract initial spatial features from the input EEG spectrograms, and the pyramid module is introduced to learn multi-scale features from the initial features. Then by taking these multi-scale features as input temporal features, the sequential aggregation module employs multiple Long Short-Term Memory(LSTM) blocks to aggregate these features. In addition, a dual-loss function is introduced to alleviate the class imbalance problem. The proposed method achieves an average sensitivity of 96.27% with a mean false prediction rate of 0.00/h on the CHB-MIT dataset and an average sensitivity of 93.57% with a mean false prediction rate of 0.044/h on the Kaggle dataset. The comparative results demonstrate that the proposed method outperforms 10 state-of-the-art epileptic seizure prediction models.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-02-24 DOI: 10.1109/JBHI.2025.3544548
MambaSAM: A Visual Mamba-Adapted SAM Framework for Medical Image Segmentation
Pengchen Liang, Leijun Shi, Bin Pu, Renkai Wu, Jianguo Chen, Lixin Zhou, Lite Xu, Zhuangzhuang Chen, Qing Chang, Yiwei Li
The Segment Anything Model (SAM) has shown exceptional versatility in segmentation tasks across various natural image scenarios. However, its application to medical image segmentation poses significant challenges due to the intricate anatomical details and domain-specific characteristics inherent in medical images. To address these challenges, we propose a novel VMamba adapter framework that integrates a lightweight, trainable Visual Mamba (VMamba) branch with the pre-trained SAM ViT encoder. The VMamba adapter accurately captures multi-scale contextual correlations, integrates global and local information, and reduces the ambiguities that arise when only local features are used. Specifically, we propose a novel cross-branch attention (CBA) mechanism to facilitate effective interaction between the SAM and VMamba branches. This mechanism enables the model to learn and adapt more efficiently to the nuances of medical images, extracting rich, complementary features that enhance its representational capacity. Beyond architectural enhancements, we streamline the segmentation workflow by eliminating the need for prompt-driven input mechanisms. This results in an autonomous prediction model that reduces manual input requirements and improves operational efficiency. In addition, our method introduces only minimal additional trainable parameters, offering an efficient solution for medical image segmentation. Extensive evaluations on four medical image datasets demonstrate that our VMamba adapter framework achieves state-of-the-art performance. Specifically, on the ACDC dataset with limited training data, our method achieves an average Dice coefficient improvement of 0.18 and reduces the Hausdorff distance by 20.38 mm compared to AutoSAM.
{"title":"MambaSAM: A Visual Mamba-Adapted SAM Framework for Medical Image Segmentation.","authors":"Pengchen Liang, Leijun Shi, Bin Pu, Renkai Wu, Jianguo Chen, Lixin Zhou, Lite Xu, Zhuangzhuang Chen, Qing Chang, Yiwei Li","doi":"10.1109/JBHI.2025.3544548","DOIUrl":"https://doi.org/10.1109/JBHI.2025.3544548","url":null,"abstract":"<p><p>The Segment Anything Model (SAM) has shown exceptional versatility in segmentation tasks across various natural image scenarios. However, its application to medical image segmentation poses significant challenges due to the intricate anatomical details and domain-specific characteristics inherent in medical images. To address these challenges, we propose a novel VMamba adapter framework that integrates a lightweight, trainable Visual Mamba (VMamba) branch with the pre-trained SAM ViT encoder. The VMamba adapter accurately captures multi-scale contextual correlations, integrates global and local information, and reduces ambiguities arising from local features only. Specifically, we propose a novel cross-branch attention (CBA) mechanism to facilitate effective interaction between the SAM and VMamba branches. This mechanism enables the model to learn and adapt more efficiently to the nuances of medical images, extracting rich, complementary features that enhance its representational capacity. Beyond architectural enhancements, we streamline the segmentation workflow by eliminating the need for prompt-driven input mechanisms. This results in an autonomous prediction model that reduces manual input requirements and improves operational efficiency. In addition, our method introduces only minimal additional trainable parameters, offering an efficient solution for medical image segmentation. Extensive evaluations of four medical image datasets demonstrate that our VMamba adapter framework achieves state-of-the-art performance. Specifically, on the ACDC dataset with limited training data, our method achieves an average Dice coefficient improvement of 0.18 and reduces the Hausdorff distance by 20.38 mm compared to the AutoSAM.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143541570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}