Pub Date: 2024-10-21. DOI: 10.1109/JBHI.2024.3480928
Sanqian Li, Risa Higashita, Huazhu Fu, Bing Yang, Jiang Liu
Optical coherence tomography (OCT) is a widely used non-invasive imaging modality for ophthalmic diagnosis. However, inherent speckle noise is the leading cause of OCT image quality degradation, and efficient speckle removal algorithms can improve image readability and benefit automated clinical analysis. Because speckle removal is an ill-posed inverse problem, learning suitable priors is of utmost importance. In this work, we develop a score prior guided iterative solver (SPIS) in logarithmic space to remove speckles from OCT images. Specifically, we model the posterior distribution of raw OCT images as a data consistency term and transform speckle removal from a nonlinear into a linear inverse problem in the logarithmic domain. Subsequently, the prior distribution learned through the diffusion model's score function is used as a constraint on the data consistency term in the linear inverse optimization, resulting in an iterative speckle removal procedure that alternates between a score prior predictor and a subsequent non-expansive data consistency corrector. Experimental results on private and public OCT datasets demonstrate that the proposed SPIS achieves excellent performance in speckle removal and out-of-distribution (OOD) generalization. Further downstream automatic analysis of the OCT images verifies that the proposed SPIS can benefit clinical applications. The data and code are available at https://github.com/lisanqian1212/SPIS.
Title: "Score Prior Guided Iterative Solver for Speckles Removal in Optical Coherent Tomography Images." Journal: IEEE Journal of Biomedical and Health Informatics.
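The logarithmic-domain linearization described in this abstract can be sketched in a few lines: under a log transform, multiplicative speckle becomes additive noise, so a standard additive denoiser applies before mapping back. The Gaussian filter below is a deliberately crude stand-in for the paper's score-prior predictor, and the gamma speckle model is an assumption for illustration only.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def log_domain_despeckle(img, sigma=2.0):
    """Illustrative log-domain despeckling (not the paper's SPIS solver):
    multiplicative speckle y = x * n becomes additive in log space,
    log y = log x + log n, so an additive denoiser can be applied there."""
    eps = 1e-6
    log_img = np.log(img + eps)                  # nonlinear -> linear (additive) problem
    log_clean = gaussian_filter(log_img, sigma)  # stand-in for the score-prior predictor
    return np.exp(log_clean) - eps               # map back to the intensity domain

rng = np.random.default_rng(0)
clean = np.ones((64, 64))
# Gamma-distributed multiplicative noise (mean 1) as a toy speckle model
speckled = clean * rng.gamma(shape=4.0, scale=0.25, size=clean.shape)
restored = log_domain_despeckle(speckled)
```

Smoothing in the log domain suppresses the multiplicative fluctuations, which is the property the paper's linear-inverse formulation relies on.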
Evidence-based medicine (EBM) represents a paradigm of providing patient care grounded in the most current and rigorously evaluated research. Recent advances in large language models (LLMs) offer a potential solution to transform EBM by automating labor-intensive tasks and thereby improving the efficiency of clinical decision-making. This study explores integrating LLMs into the key stages of EBM, evaluating their ability across evidence retrieval (PICO extraction, biomedical question answering), synthesis (summarizing randomized controlled trials), and dissemination (medical text simplification). We conducted a comparative analysis of seven LLMs, including both proprietary and open-source models, as well as those fine-tuned on medical corpora. Specifically, we benchmarked the performance of various LLMs on each EBM task under zero-shot settings as baselines, and employed prompting techniques, including in-context learning, chain-of-thought reasoning, and knowledge-guided prompting, to enhance their capabilities. Our extensive experiments revealed the strengths of LLMs, such as remarkable understanding capabilities even in zero-shot settings, strong summarization skills, and effective knowledge transfer via prompting. Prompting strategies such as knowledge-guided prompting proved highly effective (e.g., improving the performance of GPT-4 by 13.10% over zero-shot in PICO extraction). However, the experiments also showed limitations, with LLM performance falling well below state-of-the-art baselines like PubMedBERT in handling named entity recognition tasks. Moreover, human evaluation revealed persisting challenges with factual inconsistencies and domain inaccuracies, underscoring the need for rigorous quality control before clinical application. This study provides insights into enhancing EBM using LLMs while highlighting critical areas for further research. The code is publicly available on Github.
Title: "Benchmarking Large Language Models in Evidence-Based Medicine." Authors: Jin Li, Yiyan Deng, Qi Sun, Junjie Zhu, Yu Tian, Jingsong Li, Tingting Zhu. Pub Date: 2024-10-21. DOI: 10.1109/JBHI.2024.3483816. Journal: IEEE Journal of Biomedical and Health Informatics.
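Knowledge-guided prompting of the kind benchmarked above can be illustrated by prepending task-specific domain knowledge to the instruction. The template below is a hypothetical sketch; the study's actual prompt wording is not given in the abstract.

```python
# Hypothetical sketch of knowledge-guided prompting for PICO extraction;
# the paper's actual prompt templates are not specified in the abstract.
PICO_DEFINITIONS = {
    "P": "Population: the patients or problem being studied",
    "I": "Intervention: the treatment or exposure of interest",
    "C": "Comparison: the alternative against which it is judged",
    "O": "Outcome: the measured clinical result",
}

def build_pico_prompt(abstract_text: str) -> str:
    # Inject domain knowledge (PICO element definitions) ahead of the task,
    # a common form of knowledge-guided prompting.
    knowledge = "\n".join(PICO_DEFINITIONS.values())
    return (
        "You are assisting with evidence-based medicine.\n"
        f"Use these definitions:\n{knowledge}\n\n"
        "Extract the P, I, C, and O elements from the abstract below.\n"
        f"Abstract: {abstract_text}"
    )

prompt = build_pico_prompt(
    "A randomized trial of drug X vs placebo in adults with "
    "type 2 diabetes measuring HbA1c."
)
```

The same skeleton extends to in-context learning (append worked examples) or chain-of-thought (append "reason step by step" style instructions).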
Pub Date: 2024-10-21. DOI: 10.1109/JBHI.2024.3483812
Yu Li, Lin-Xuan Hou, Zhu-Hong You, Yang Yuan, Cheng-Gang Mi, Yu-An Huang, Hai-Cheng Yi
Predicting drug-drug interaction (DDI) events is a significant task for deep learning in drug safety: accurate prediction can effectively reduce potential adverse consequences and improve therapeutic safety. Graph neural network (GNN)-based models have made satisfactory progress in DDI event prediction. However, most existing models overlook crucial drug structure and interaction information, which is necessary for accurate DDI event prediction. To tackle this issue, we introduce a new method called MRGCDDI. This approach employs contrastive learning but, unlike conventional methods, does not require data augmentation, thereby avoiding additional noise. MRGCDDI maintains the semantics of the graph data under encoder perturbation through a simple yet effective contrastive learning approach, without the manual trial and error, tedious searching, or expensive domain knowledge needed to select augmentations. The approach effectively integrates drug features extracted from drug molecular graphs with information from multi-relational drug-drug interaction (DDI) networks. Extensive experimental results demonstrate that MRGCDDI outperforms state-of-the-art methods on two benchmark datasets. Specifically, on Deng's dataset, MRGCDDI achieves an average increase of 4.33% in accuracy, 11.57% in Macro-F1, 10.97% in Macro-Recall, and 10.64% in Macro-Precision. Similarly, on Ryu's dataset, the model shows an average increase of 2.42% in accuracy, 3.86% in Macro-F1, 3.49% in Macro-Recall, and 2.75% in Macro-Precision. All the data and code of this work are available at https://github.com/Nokeli/MRGCDDI.
Title: "MRGCDDI: Multi-Relation Graph Contrastive Learning without Data Augmentation for Drug-Drug Interaction Events Prediction." Journal: IEEE Journal of Biomedical and Health Informatics.
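The augmentation-free contrastive idea above (two views produced by perturbing the encoder rather than the data) can be sketched with a toy NT-Xent loss. The tanh "encoder" and random features below are illustrative placeholders, not MRGCDDI's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    # Toy one-layer encoder; MRGCDDI uses GNN encoders over drug graphs.
    return np.tanh(x @ W)

def nt_xent(z1, z2, tau=0.5):
    # Normalized-temperature cross entropy over cosine similarities:
    # each drug's perturbed-encoder view is its positive; all other
    # drugs in the batch act as negatives.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

x = rng.normal(size=(8, 16))                       # toy drug features
W = rng.normal(size=(16, 4))
W_perturbed = W + 0.01 * rng.normal(size=W.shape)  # encoder perturbation, no data augmentation
loss = nt_xent(encode(x, W), encode(x, W_perturbed))
```

Because both views come from the same input, the semantics of the graph data are preserved by construction, which is the point the abstract makes against augmentation-based schemes.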
Pub Date: 2024-10-21. DOI: 10.1109/JBHI.2024.3483999
Hoda Nemat, Heydar Khadem, Jackie Elliott, Mohammed Benaissa
Blood glucose level (BGL) prediction contributes to more effective management of type 1 diabetes. Physical activity (PA) is a crucial factor in diabetes management: it affects BGL, and it is imperative to deploy PA effectively in BGL prediction so that diabetes management systems incorporate this factor. Due to the erratic nature of PA's impact on BGL between and within patients, and due to insufficient knowledge, deploying PA in BGL prediction is challenging. Hence, optimal approaches for fusing PA with BGL are needed to improve prediction performance. To address this gap, we propose novel methodologies for extracting information from PA data and integrating it into BGL prediction. This paper develops several PA-informed prediction models by extracting information from PA data and fusing it with BGL data at the signal, feature, and decision levels, to find the optimal approach for deploying PA in BGL prediction models. For signal-level fusion, different automatically recorded PA data are fused with BGL data. For feature-level fusion, three feature engineering approaches are developed: subjective assessments of PA, objective assessments of PA, and statistics of PA. For decision-level fusion, ensemble learning is used to combine predictions from models trained with different inputs. A comparative investigation is then performed between the developed PA-informed approaches and the no-fusion approach, as well as among the approaches themselves. The analyses are performed on the publicly available Ohio dataset with rigorous evaluation. The results show that deploying PA can statistically significantly improve BGL prediction performance.
Also, among the developed approaches to leveraging PA in BGL prediction, fusing heart rate data at the signal level and PA intensity categories at the feature level with BGL data are the most effective. Our methodologies help determine optimal approaches, including the kind of PA information and the fusion method, to improve BGL prediction performance effectively.
Title: "Physical Activity Integration in Blood Glucose Level Prediction: Different Levels of Data Fusion." Journal: IEEE Journal of Biomedical and Health Informatics.
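The decision-level fusion step described above (ensemble learning over models trained on different inputs) reduces, in its simplest form, to averaging forecasts. The numbers below are synthetic placeholders, not the paper's models or data, chosen so the two models err in opposite directions.

```python
import numpy as np

# Toy sketch of decision-level fusion: combine forecasts from models trained
# on different inputs (BGL-only vs. BGL + PA) by simple averaging.
true_bgl = np.array([110.0, 118.0, 126.0, 133.0])        # mg/dL, synthetic
pred_bgl_only = np.array([112.0, 121.0, 131.0, 140.0])   # over-predicts a rise
pred_bgl_pa = np.array([109.0, 114.0, 122.0, 128.0])     # under-predicts after exercise

fused = (pred_bgl_only + pred_bgl_pa) / 2                # decision-level fusion

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))
```

When the constituent models' errors disagree in sign, the fused decision can beat either single model, which is the rationale for evaluating fusion at this level alongside the signal and feature levels.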
Pub Date: 2024-10-18. DOI: 10.1109/JBHI.2024.3483316
Qian Gao, Tao Xu, Xiaodi Li, Wanling Gao, Haoyuan Shi, Youhua Zhang, Jie Chen, Zhenyu Yue
Tumor heterogeneity presents a significant challenge in predicting drug responses, especially as missense mutations within the same gene can lead to varied outcomes such as drug resistance, enhanced sensitivity, or therapeutic ineffectiveness. These complex relationships highlight the need for advanced analytical approaches in oncology. Due to their powerful ability to handle heterogeneous data, graph convolutional networks (GCNs) represent a promising approach for predicting drug responses. However, simple bipartite graphs cannot accurately capture the complex relationships involved in missense mutation and drug response. Furthermore, deep learning models for drug response are often considered "black boxes," and their interpretability remains a widely discussed issue. To address these challenges, we propose an Interpretable Dynamic Directed Graph Convolutional Network (IDDGCN) framework, which incorporates four key features: (1) the use of directed graphs to differentiate between sensitivity and resistance relationships, (2) the dynamic updating of node weights based on node-specific interactions, (3) the exploration of associations between different mutations within the same gene and drug response, and (4) the enhancement of model interpretability through a weighting mechanism that accounts for biological significance, alongside a ground-truth construction method to evaluate prediction transparency. The experimental results demonstrate that IDDGCN outperforms existing state-of-the-art models, exhibiting excellent predictive power.
Both qualitative and quantitative evaluations of its interpretability further highlight its ability to explain predictions, offering a fresh perspective for precision oncology and targeted drug development.
Title: "Interpretable Dynamic Directed Graph Convolutional Network for Multi-Relational Prediction of Missense Mutation and Drug Response." Journal: IEEE Journal of Biomedical and Health Informatics.
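Feature (1) above, directed graphs with separate sensitivity and resistance relations, can be sketched as one relation-specific message-passing step. This is a simplified R-GCN-style stand-in, not IDDGCN's dynamic weighting; the tiny graph and weights are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 4, 8  # toy graph: 4 nodes (mutation-drug entities), 8-dim features

# Directed adjacency per relation: an edge i->j in A_sens encodes a
# sensitivity relationship from i to j; A_res encodes resistance.
# Direction matters, so these matrices are not symmetric.
A_sens = np.array([[0, 1, 0, 0],
                   [0, 0, 0, 0],
                   [0, 0, 0, 1],
                   [0, 0, 0, 0]], dtype=float)
A_res = np.array([[0, 0, 1, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 0],
                  [1, 0, 0, 0]], dtype=float)

X = rng.normal(size=(n, d))
W_sens = rng.normal(size=(d, d))
W_res = rng.normal(size=(d, d))

def relational_gcn_layer(X, rels):
    # One relational step: aggregate each relation with its own weights,
    # then apply a nonlinearity. A simplified stand-in for IDDGCN's
    # dynamic directed convolution.
    out = X.copy()  # self-connection
    for A, W in rels:
        deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)
        out = out + (A / deg) @ X @ W
    return np.maximum(out, 0.0)  # ReLU

H = relational_gcn_layer(X, [(A_sens, W_sens), (A_res, W_res)])
```

Keeping a distinct weight matrix per directed relation is what lets the model tell "mutation confers sensitivity" apart from "mutation confers resistance", which a plain bipartite graph collapses.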
This study introduces an innovative deep-learning model for cuffless blood pressure estimation using PPG and ECG signals, demonstrating state-of-the-art performance on the largest clean dataset, PulseDB. The rU-Net architecture, a fusion of U-Net and ResNet, enhances both generalization and feature extraction accuracy. Accurate multi-scale feature capture is facilitated by short-time Fourier transform (STFT) time-frequency distributions and multi-head attention mechanisms, allowing data-driven feature selection. The inclusion of demographic parameters as supervisory information further elevates performance. On the calibration-based dataset, our model excels, achieving outstanding accuracy (SBP MAE ± std: 4.49 ± 4.86 mmHg, DBP MAE ± std: 2.69 ± 3.10 mmHg), surpassing AAMI standards and earning a BHS Grade A rating. Addressing the challenge of calibration-free data, we propose a fine-tuning-based transfer learning approach. Remarkably, with only 10% data transfer, our model attains exceptional accuracy (SBP MAE ± std: 4.14 ± 5.01 mmHg, DBP MAE ± std: 2.48 ± 2.93 mmHg). This study sets the stage for the development of highly accurate and reliable wearable cuffless blood pressure monitoring devices.
Title: "rU-Net, Multi-Scale Feature Fusion and Transfer Learning: Unlocking the Potential of Cuffless Blood Pressure Monitoring with PPG and ECG." Authors: Jiaming Chen, Xueling Zhou, Lei Feng, Bingo Wing-Kuen Ling, Lianyi Han, Hongtao Zhang. Pub Date: 2024-10-18. DOI: 10.1109/JBHI.2024.3483301. Journal: IEEE Journal of Biomedical and Health Informatics.
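The STFT time-frequency representation this model consumes can be computed directly with SciPy. The synthetic "PPG" below is a bare sinusoid at 1.2 Hz (72 bpm), an illustrative stand-in for a real PulseDB waveform; the sampling rate and window length are assumptions.

```python
import numpy as np
from scipy.signal import stft

fs = 125                       # assumed PPG sampling rate, Hz
t = np.arange(0, 10, 1 / fs)   # 10 s of signal
ppg = np.sin(2 * np.pi * 1.2 * t)   # toy pulse wave at 1.2 Hz (72 bpm)

# Short-time Fourier transform: the |Zxx| time-frequency distribution is the
# kind of multi-scale input the abstract describes feeding to the network.
f, times, Zxx = stft(ppg, fs=fs, nperseg=256)
magnitude = np.abs(Zxx)

# Dominant frequency across time should sit near the pulse rate.
peak_freq = f[np.argmax(magnitude.mean(axis=1))]
```

Stacking such magnitude maps from PPG and ECG channels gives the 2-D inputs over which multi-head attention can then select informative time-frequency regions.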
Pub Date: 2024-10-17. DOI: 10.1109/JBHI.2024.3482569
Dongmin Huang, Yongshen Zeng, Yingen Zhu, Xiaoyan Song, Liping Pan, Jie Yang, Yanrong Wang, Hongzhou Lu, Wenjin Wang
Existing respiratory monitoring techniques primarily focus on respiratory rate measurement, neglecting the potential of using thoracoabdominal patterns of respiration for infant lung health assessment. To bridge this gap, we exploit the unique advantage of spatial redundancy of a camera sensor to analyze the infant thoracoabdominal respiratory motion. Specifically, we propose a camera-based respiratory imaging (CRI) system that utilizes optical flow to construct a spatio-temporal respiratory imager for comparing the infant chest and abdominal respiratory motion, and employs deep learning algorithms to identify infant abdominal, thoracoabdominal synchronous, and thoracoabdominal asynchronous patterns of respiration. To alleviate the challenges posed by limited clinical training data and subject variability, we introduce a novel multiple-expert contrastive learning (MECL) strategy to CRI. It enriches training samples by reversing and pairing different-class data, and promotes the representation consistency of same-class data through multi-expert collaborative optimization. Clinical validation involving 44 infants shows that MECL achieves 70% in sensitivity and 80.21% in specificity, which validates the feasibility of CRI for respiratory pattern recognition. This work investigates a novel video-based approach for assessing the infant thoracoabdominal patterns of respiration, revealing a new value stream of video health monitoring in neonatal care.
Title: "Camera-Based Respiratory Imaging System for Monitoring Infant Thoracoabdominal Patterns of Respiration." Journal: IEEE Journal of Biomedical and Health Informatics.
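The synchrony comparison at the core of the system above can be illustrated on per-region motion signals (the kind optical flow would yield for chest and abdomen): the lag of the peak cross-correlation separates synchronous from asynchronous (paradoxical) breathing. The signals, frame rate, and quarter-period threshold below are illustrative assumptions, not the paper's deep-learning classifier.

```python
import numpy as np

fs = 30                                   # assumed camera frame rate, Hz
t = np.arange(0, 10, 1 / fs)
chest = np.sin(2 * np.pi * 0.7 * t)       # ~42 breaths/min chest motion
abdomen_sync = np.sin(2 * np.pi * 0.7 * t)
abdomen_async = np.sin(2 * np.pi * 0.7 * t + np.pi)   # paradoxical, out of phase

def phase_lag(a, b, fs):
    # Lag (seconds) of the peak cross-correlation between two motion signals.
    a = a - a.mean()
    b = b - b.mean()
    xcorr = np.correlate(a, b, mode="full")
    lag = np.argmax(xcorr) - (len(a) - 1)
    return lag / fs

def classify(chest, abdomen, fs, period=1 / 0.7):
    # Illustrative rule: lags under a quarter period count as synchronous.
    return "synchronous" if abs(phase_lag(chest, abdomen, fs)) < period / 4 \
        else "asynchronous"
```

A deep model replaces this hand-set threshold in the actual CRI system, but the signal-level intuition (phase agreement between thoracic and abdominal motion) is the same.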
Pub Date : 2024-10-17DOI: 10.1109/JBHI.2024.3483577
Sumit Dalal, Deepa Tilwani, Manas Gaur, Sarika Jain, Valerie L Shalin, Amit P Sheth
The lack of explainability in using relevant clinical knowledge hinders the adoption of artificial intelligence-powered analysis of unstructured clinical dialogue. A wealth of relevant, untapped Mental Health (MH) data is available in online communities, providing the opportunity to address the explainability problem with substantial potential impact as a screening tool for both online and offline applications. Inspired by how clinicians rely on their expertise when interacting with patients, we leverage relevant clinical knowledge to classify and explain depression-related data, reducing manual review time and engendering trust. We developed a method to enhance attention in contemporary transformer models and generate explanations for classifications that are understandable by mental health practitioners (MHPs) by incorporating external clinical knowledge. We propose a domain-general architecture called ProcesS knowledge-infused cross ATtention (PSAT) that incorporates clinical practice guidelines (CPG) when computing attention. We transform a CPG resource focused on depression, such as the Patient Health Questionnaire (PHQ-9) and related questions, into a machine-readable ontology using SNOMED-CT. With this resource, PSAT enhances the ability of models like GPT-3.5 to generate application-relevant explanations. Evaluation on four expert-curated datasets related to depression demonstrates the application relevance of PSAT's explanations. PSAT surpasses the performance of twelve baseline models and can provide explanations where other baselines fall short.
{"title":"A Cross Attention Approach to Diagnostic Explainability Using Clinical Practice Guidelines for Depression.","authors":"Sumit Dalal, Deepa Tilwani, Manas Gaur, Sarika Jain, Valerie L Shalin, Amit P Sheth","doi":"10.1109/JBHI.2024.3483577","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3483577","url":null,"abstract":"<p><p>The lack of explainability in using relevant clinical knowledge hinders the adoption of artificial intelligence-powered analysis of unstructured clinical dialogue. A wealth of relevant, untapped Mental Health (MH) data is available in online communities, providing the opportunity to address the explainability problem with substantial potential impact as a screening tool for both online and offline applications. Inspired by how clinicians rely on their expertise when interacting with patients, we leverage relevant clinical knowledge to classify and explain depression-related data, reducing manual review time and engendering trust. We developed a method to enhance attention in contemporary transformer models and generate explanations for classifications that are understandable by mental health practitioners (MHPs) by incorporating external clinical knowledge. We propose a domain-general architecture called ProcesS knowledge-infused cross ATtention (PSAT) that incorporates clinical practice guidelines (CPG) when computing attention. We transform a CPG resource focused on depression, such as the Patient Health Questionnaire (PHQ-9) and related questions, into a machine-readable ontology using SNOMED-CT. With this resource, PSAT enhances the ability of models like GPT-3.5 to generate application-relevant explanations. Evaluation on four expert-curated datasets related to depression demonstrates the application relevance of PSAT's explanations. 
PSAT surpasses the performance of twelve baseline models and can provide explanations where other baselines fall short.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142464056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
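The knowledge-infused attention idea behind PSAT can be illustrated with a minimal sketch, assuming the clinical-knowledge signal (e.g. PHQ-9 concept matches from the CPG ontology) enters as an additive bias on the cross-attention scores before the softmax; the paper's actual integration is more involved, and all names here are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def knowledge_infused_cross_attention(queries, keys, values, relevance):
    """Cross attention whose scores are biased by a clinical-knowledge
    relevance matrix before the softmax, steering attention toward
    guideline entries that match the input utterance."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d) + relevance
    weights = softmax(scores, axis=-1)
    return weights @ values, weights

rng = np.random.default_rng(0)
q = rng.normal(size=(2, 8))   # utterance-token representations
k = rng.normal(size=(4, 8))   # guideline (CPG) entry representations
v = rng.normal(size=(4, 8))
rel = np.zeros((2, 4))
rel[0, 1] = 5.0               # token 0 strongly matches CPG entry 1
out, w = knowledge_infused_cross_attention(q, k, v, rel)
```

The resulting attention weights for token 0 concentrate on the biased guideline entry, which is exactly what makes the attention map readable as an explanation tied to a clinical concept.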
Pub Date : 2024-10-17DOI: 10.1109/JBHI.2024.3482853
Xiaoyan Yuan, Wei Wang, Xiaohe Li, Yuanting Zhang, Xiping Hu, M Jamal Deen
Electrocardiography (ECG) is the gold standard for monitoring heart function and is crucial for preventing the worsening of cardiovascular diseases (CVDs). However, the inconvenience of ECG acquisition poses challenges for long-term continuous monitoring. Consequently, researchers have explored non-invasive and easily accessible photoplethysmography (PPG) as an alternative, converting it into ECG. Previous studies have focused on peaks or simple mapping to generate ECG, ignoring the inherent periodicity of cardiovascular signals. This results in an inability to accurately extract physiological information during the cycle, thus compromising the generated ECG signals' clinical utility. To this end, we introduce a novel PPG-to-ECG translation model called CATransformer, capable of adaptive modeling based on the cardiac cycle. Specifically, CATransformer automatically extracts the cycle using a cycle-aware module and creates multiple semantic views of the cardiac cycle. It leverages a transformer to capture detailed features within each cycle and the dynamics across cycles. Our method outperforms existing approaches, exhibiting the lowest RMSE across five paired PPG-ECG databases. Additionally, extensive experiments are conducted on four cardiovascular-related tasks to assess the clinical utility of the generated ECG, achieving consistent state-of-the-art performance. Experimental results confirm that CATransformer generates highly faithful ECG signals while preserving their physiological characteristics.
{"title":"CATransformer: A Cycle-Aware Transformer for High-Fidelity ECG Generation From PPG.","authors":"Xiaoyan Yuan, Wei Wang, Xiaohe Li, Yuanting Zhang, Xiping Hu, M Jamal Deen","doi":"10.1109/JBHI.2024.3482853","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3482853","url":null,"abstract":"<p><p>Electrocardiography (ECG) is the gold standard for monitoring heart function and is crucial for preventing the worsening of cardiovascular diseases (CVDs). However, the inconvenience of ECG acquisition poses challenges for long-term continuous monitoring. Consequently, researchers have explored non-invasive and easily accessible photoplethysmography (PPG) as an alternative, converting it into ECG. Previous studies have focused on peaks or simple mapping to generate ECG, ignoring the inherent periodicity of cardiovascular signals. This results in an inability to accurately extract physiological information during the cycle, thus compromising the generated ECG signals' clinical utility. To this end, we introduce a novel PPG-to-ECG translation model called CATransformer, capable of adaptive modeling based on the cardiac cycle. Specifically, CATransformer automatically extracts the cycle using a cycle-aware module and creates multiple semantic views of the cardiac cycle. It leverages a transformer to capture detailed features within each cycle and the dynamics across cycles. Our method outperforms existing approaches, exhibiting the lowest RMSE across five paired PPG-ECG databases. Additionally, extensive experiments are conducted on four cardiovascular-related tasks to assess the clinical utility of the generated ECG, achieving consistent state-of-the-art performance. 
Experimental results confirm that CATransformer generates highly faithful ECG signals while preserving their physiological characteristics.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142464062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
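A minimal sketch of the cycle-extraction idea behind CATransformer's cycle-aware module, assuming simple peak-based segmentation of the PPG trace; the authors' module is learned end to end, so this is only illustrative.

```python
import numpy as np

def segment_cycles(ppg, min_gap=10):
    """Split a PPG trace into cardiac cycles at its local maxima
    (a simple stand-in for a learned cycle-aware module)."""
    peaks = [i for i in range(1, len(ppg) - 1)
             if ppg[i] > ppg[i - 1] and ppg[i] >= ppg[i + 1]]
    # keep peaks separated by at least min_gap samples (refractory period)
    kept = []
    for p in peaks:
        if not kept or p - kept[-1] >= min_gap:
            kept.append(p)
    return [ppg[a:b] for a, b in zip(kept, kept[1:])]

fs = 100                         # 100 Hz sampling rate
t = np.arange(0, 3, 1 / fs)      # 3 s of synthetic PPG at 1 Hz heart rate
ppg = np.sin(2 * np.pi * 1.0 * t)
cycles = segment_cycles(ppg, min_gap=fs // 2)
```

Each returned segment spans one heartbeat; a transformer can then attend within each cycle and across the sequence of cycles, which is the structure the abstract describes.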
Pub Date : 2024-10-16DOI: 10.1109/JBHI.2024.3482450
Muhammad Hameed Siddiqi, Irshad Ahmad, Yousef Alhwaiti, Faheem Khan
Facial expressions vary with different health conditions, making a facial expression recognition (FER) system valuable within a healthcare framework. Achieving accurate recognition of facial expressions is a considerable challenge due to the difficulty of capturing subtle features. This research introduces an ensemble neural random forest method that uses a convolutional neural network (CNN) architecture for feature extraction and an optimized random forest for classification. For feature extraction, four convolutional layers with different numbers of filters and kernel sizes are used. Max-pooling, batch-normalization, and dropout layers are added to the model to speed up feature extraction and avoid overfitting. The extracted features are passed to the optimized random forest for classification, whose hyperparameters (number of trees, split criterion, maximum tree depth, maximum terminal nodes, minimum samples per split, and maximum features per tree) are tuned for the task. To demonstrate the significance of the proposed model, we conducted a thorough assessment of the proposed neural random forest in an extensive experiment encompassing six publicly available datasets.
{"title":"Facial Expression Recognition for Healthcare Monitoring Systems Using Neural Random Forest.","authors":"Muhammad Hameed Siddiqi, Irshad Ahmad, Yousef Alhwaiti, Faheem Khan","doi":"10.1109/JBHI.2024.3482450","DOIUrl":"https://doi.org/10.1109/JBHI.2024.3482450","url":null,"abstract":"<p><p>Facial expressions vary with different health conditions, making a facial expression recognition (FER) system valuable within a healthcare framework. Achieving accurate recognition of facial expressions is a considerable challenge due to the difficulty of capturing subtle features. This research introduces an ensemble neural random forest method that uses a convolutional neural network (CNN) architecture for feature extraction and an optimized random forest for classification. For feature extraction, four convolutional layers with different numbers of filters and kernel sizes are used. Max-pooling, batch-normalization, and dropout layers are added to the model to speed up feature extraction and avoid overfitting. The extracted features are passed to the optimized random forest for classification, whose hyperparameters (number of trees, split criterion, maximum tree depth, maximum terminal nodes, minimum samples per split, and maximum features per tree) are tuned for the task. To demonstrate the significance of the proposed model, we conducted a thorough assessment of the proposed neural random forest in an extensive experiment encompassing six publicly available datasets. 
The remarkable weighted average recognition rate of 97.3% achieved across these diverse datasets highlights the effectiveness of our approach in the context of FER systems.</p>","PeriodicalId":13073,"journal":{"name":"IEEE Journal of Biomedical and Health Informatics","volume":"PP ","pages":""},"PeriodicalIF":6.7,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142464063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
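The two-stage pipeline above (CNN feature extraction, then a random forest optimized over the listed hyperparameters) might be sketched as below, assuming scikit-learn's RandomForestClassifier and a toy, untrained convolution stage; the specific hyperparameter values and kernels are illustrative, not the paper's.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def conv_features(img, kernels, pool=2):
    """Toy convolution + max-pooling feature extractor (a stand-in for
    the paper's four-layer CNN; the real feature extractor is trained)."""
    feats = []
    kh, kw = kernels[0].shape
    for k in kernels:
        h, w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
        conv = np.array([[np.sum(img[i:i + kh, j:j + kw] * k)
                          for j in range(w)] for i in range(h)])
        pooled = conv[:h - h % pool, :w - w % pool] \
            .reshape(h // pool, pool, w // pool, pool).max(axis=(1, 3))
        feats.append(pooled.ravel())
    return np.concatenate(feats)

rng = np.random.default_rng(1)
kernels = [rng.normal(size=(3, 3)) for _ in range(4)]
# two synthetic "expression" classes: patches with different mean intensity
X = np.stack([conv_features(rng.normal(loc=c, size=(8, 8)), kernels)
              for c in (0, 1) for _ in range(20)])
y = np.array([0] * 20 + [1] * 20)

# random forest over the hyperparameters named in the abstract
rf = RandomForestClassifier(
    n_estimators=100, criterion="gini", max_depth=8,
    max_leaf_nodes=32, min_samples_split=2, max_features="sqrt",
    random_state=0).fit(X, y)
```

The forest's constructor arguments map one-to-one onto the hyperparameters the abstract says are optimized: trees, criterion, depth, terminal nodes, split size, and features per tree.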