
Journal of Biomedical Informatics: Latest Articles

Rare disease diagnosis using knowledge guided retrieval augmentation for ChatGPT
IF 4 | Zone 2 (Medicine) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-07-29 | DOI: 10.1016/j.jbi.2024.104702
Charlotte Zelin, Wendy K. Chung, Mederic Jeanne, Gongbo Zhang, Chunhua Weng

Although rare diseases individually have a low prevalence, they collectively affect nearly 400 million individuals around the world. On average, an accurate rare disease diagnosis takes five years, and many patients remain undiagnosed or misdiagnosed. Because machine learning technologies have previously been used to aid diagnostics, this study tests ChatGPT’s suitability for rare disease diagnostic support when enhanced with Retrieval Augmented Generation (RAG). RareDxGPT, our enhanced ChatGPT model, supplies ChatGPT with information about 717 rare diseases from an external knowledge resource, the RareDis Corpus, through RAG. In RareDxGPT, when a query is entered, the three documents in the RareDis Corpus most relevant to the query are retrieved and passed to ChatGPT along with the query to produce a diagnosis. Phenotypes for thirty different diseases were extracted from the free text of PubMed case reports. Each case was entered with three different prompt types: “Prompt”, “Prompt + Explanation”, and “Prompt + Role Play”. The accuracy of ChatGPT and RareDxGPT with each prompt type was then measured. With “Prompt”, RareDxGPT achieved 40% accuracy, while ChatGPT 3.5 got 37% of the cases correct. With “Prompt + Explanation”, RareDxGPT achieved 43% accuracy, while ChatGPT 3.5 got 23% of the cases correct. With “Prompt + Role Play”, RareDxGPT achieved 40% accuracy, while ChatGPT 3.5 got 23% of the cases correct. In conclusion, ChatGPT, especially when supplied with extra domain-specific knowledge, demonstrates early potential, with adjustments, for rare disease diagnosis.
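
The retrieval step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how relevance is scored, so bag-of-words cosine similarity is an assumption, and the toy corpus, `retrieve_top_k`, and `build_prompt` are hypothetical names introduced here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_k(query, corpus, k=3):
    """Return the names of the k corpus documents most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), name) for name, text in corpus.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable sort, best first
    return [name for _, name in scored[:k]]


def build_prompt(query, corpus, k=3):
    """Concatenate the retrieved documents with the query (assumed prompt layout)."""
    names = retrieve_top_k(query, corpus, k)
    context = "\n".join(f"[{n}] {corpus[n]}" for n in names)
    return f"Context:\n{context}\n\nPatient phenotypes: {query}\nMost likely diagnosis?"


# Toy stand-in for the RareDis Corpus (illustrative text, not real corpus entries).
corpus = {
    "disease_a": "muscle weakness fatigue drooping eyelids",
    "disease_b": "joint pain skin rash fever",
    "disease_c": "muscle weakness difficulty swallowing",
    "disease_d": "hearing loss vision loss",
}
query = "progressive muscle weakness and drooping eyelids"
top = retrieve_top_k(query, corpus)
prompt = build_prompt(query, corpus)
```

Fixing `k=3` mirrors the three retrieved documents mentioned in the abstract. A production retriever would more plausibly use dense embeddings, which is a design choice the abstract leaves open.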

Journal of Biomedical Informatics, Volume 157, Article 104702.
Citations: 0
Desiderata for discoverability and FAIR adoption of health data hubs
IF 4 | Zone 2 (Medicine) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-07-28 | DOI: 10.1016/j.jbi.2024.104700
Celia Alvarez-Romero, Máximo Bernabeu-Wittel, Carlos Luis Parra-Calderón, Silvia Rodríguez Mejías, Alicia Martínez-García

Background

The future European Health Research and Innovation Cloud (HRIC), as a fundamental part of the European Health Data Space (EHDS), will promote the secondary use of data and the capabilities to push the boundaries of health research within an ethical and legally compliant framework that reinforces the trust of patients and citizens.

Objective

This study aimed to analyse health data management mechanisms in Europe to determine their alignment with FAIR principles and data discovery, generating best practices for new data hubs joining the HRIC ecosystem. To this end, the compliance of health data hubs with FAIR principles and data discovery was assessed, and a set of best practices for health data hubs was derived.

Methods

A survey was conducted in January 2022, involving 99 representative health data hubs from multiple countries, and 42 responses were obtained in June 2022. Stratification methods were employed to cover different levels of granularity. The survey data was analysed to assess compliance with FAIR and data discovery principles. The study started with a general analysis of survey responses, followed by the creation of specific profiles based on three categories: organization type, function, and level of data aggregation.

Results

The study produced specific best practices for data hubs regarding the adoption of FAIR principles and data discoverability. It also provided an overview of the survey study and specific profiles derived from category analysis, considering different types of data hubs.

Conclusions

The study concluded that a significant number of health data hubs in Europe did not fully comply with FAIR and data discovery principles. However, the study identified specific best practices that can guide new data hubs in adhering to these principles. The study highlighted the importance of aligning health data management mechanisms with FAIR principles to enhance interoperability and reusability in the future HRIC.

Journal of Biomedical Informatics, Volume 157, Article 104700.
PDF: https://www.sciencedirect.com/science/article/pii/S1532046424001187/pdfft?md5=8528674c63bb931855f719c8a92b3d67&pid=1-s2.0-S1532046424001187-main.pdf
Citations: 0
Prediction of hypertension risk based on multiple feature fusion
IF 4 | Zone 2 (Medicine) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-07-22 | DOI: 10.1016/j.jbi.2024.104701
Jingdong Yang, Han Wang, Peng Liu, Yuhang Lu, Minghui Yao, Haixia Yan

Objective

In the application of machine learning to the prediction of hypertension, many factors seriously affect classification accuracy and generalization performance. We propose a pulse wave classification model based on multi-feature fusion for accurate prediction of hypertension.

Methods and Materials

We propose an ensemble under-sampling model with dynamic weights to reduce the influence of class imbalance on classification and to classify hypertension automatically from inquiry diagnosis. We also build a deep learning model based on a hybrid attention mechanism, which transforms pulse waves into feature maps for the extraction of in-depth features, so as to classify hypertension automatically from pulse diagnosis. We build the multi-feature fusion model on dynamic Dempster/Shafer (DS) theory, combining inquiry diagnosis and pulse diagnosis to enhance the fault tolerance of prediction across multiple classifiers. In addition, this study ranks the importance of scale features in inquiry diagnosis and of temporal and frequency-domain features in pulse diagnosis.
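
The fusion of two classifiers through DS theory can be illustrated with the classical Dempster rule of combination. This is a sketch under stated assumptions: the paper's "dynamic" variant is not specified in the abstract and is not modeled here, and the mass values and the name `dempster_combine` are hypothetical.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    # Renormalise so the surviving masses sum to one.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


H = frozenset({"hypertensive"})
N = frozenset({"normotensive"})
EITHER = H | N  # mass left on the whole frame expresses uncertainty

# Illustrative masses only, not values from the paper.
inquiry = {H: 0.6, N: 0.1, EITHER: 0.3}
pulse = {H: 0.7, N: 0.2, EITHER: 0.1}
fused = dempster_combine(inquiry, pulse)
```

When both sources lean toward the same class, the fused belief in that class exceeds either individual belief, which is one way such a fusion can tolerate a single weak classifier.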

Results

After 5-fold cross-validation, the accuracy, sensitivity, specificity, F1-score and G-mean were 94.08%, 93.43%, 96.86%, 93.45% and 95.12%, respectively, on hypertension samples from 409 cases collected at Longhua Hospital affiliated to Shanghai University of Traditional Chinese Medicine and at the Hospital of Integrated Traditional Chinese and Western Medicine. We identify the key factors influencing hypertension classification accuracy, so as to assist in the prevention and clinical diagnosis of hypertension.
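
For readers unfamiliar with the five reported metrics, the sketch below computes them from binary confusion-matrix counts using their standard definitions. The counts are hypothetical, since the abstract does not report a confusion matrix, and `binary_metrics` is a name introduced here. G-mean is the geometric mean of sensitivity and specificity; applied to the reported 93.43% and 96.86% it gives roughly 95.1%, in line with the reported 95.12% (per-fold averaging could explain any small difference).

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, specificity, F1 and G-mean from counts."""
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "g_mean": g_mean,
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```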

Conclusion

Compared with the state-of-the-art models, the multi-feature fusion model effectively utilizes the patients’ correlated multimodal features, and has higher classification accuracy and generalization performance.

Journal of Biomedical Informatics, Volume 157, Article 104701.
Citations: 0
Enhancing identification performance of cognitive impairment high-risk based on a semi-supervised learning method
IF 4 | Zone 2 (Medicine) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-07-19 | DOI: 10.1016/j.jbi.2024.104699
Sumei Yao, Yan Zhang, Jing Chen, Quan Lu, Zhiguang Zhao

Background

Cognitive assessment plays a pivotal role in the early detection of cognitive impairment, particularly in the prevention and management of cognitive diseases such as Alzheimer’s and Lewy body dementia. Large-scale screening relies heavily on cognitive assessment scales as primary tools, some of which have low sensitivity while others are expensive. Despite significant progress in machine learning for cognitive function assessment, its application in this particular screening domain remains underexplored and often requires labor-intensive expert annotations.

Aims

This paper introduces a semi-supervised learning algorithm based on pseudo-label with putback (SS-PP), aiming to enhance model efficiency in predicting the high risk of cognitive impairment (HR-CI) by utilizing the distribution of unlabeled samples.

Data

The study involved 189 labeled samples and 215,078 unlabeled samples from the real world. A semi-supervised classification algorithm was designed and evaluated by comparison with 14 traditional supervised machine-learning methods and with other advanced semi-supervised algorithms.
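
The general idea of pseudo-labelling with few labeled and many unlabeled samples can be sketched as a self-training loop. This is an illustration under stated assumptions, not the SS-PP algorithm: the abstract does not describe the "putback" step, so it is not modeled, and the nearest-centroid classifier, the confidence rule, the threshold, and all function names are hypothetical choices made to keep the example self-contained (the paper's best model is based on GBDT).

```python
def fit_centroids(samples):
    """Return the mean feature value per class from (x, label) pairs."""
    sums, counts = {}, {}
    for x, y in samples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_with_confidence(centroids, x):
    """Predict the nearest centroid's class; confidence is a relative distance margin."""
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best, runner_up = ranked[0], ranked[1]
    d_best = abs(x - centroids[best])
    d_next = abs(x - centroids[runner_up])
    total = d_best + d_next
    confidence = 1.0 - d_best / total if total else 0.5
    return best, confidence


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Iteratively move confidently predicted unlabeled points into the training set."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(labeled)
        confident, remaining = [], []
        for x in pool:
            y, conf = predict_with_confidence(centroids, x)
            (confident if conf >= threshold else remaining).append((x, y))
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)          # pseudo-labels join the training set
        pool = [x for x, _ in remaining]   # ambiguous points stay unlabeled
    return fit_centroids(labeled), pool


# Toy 1-D data: class 0 clusters near 0, class 1 near 10 (illustrative only).
labeled = [(0.0, 0), (1.0, 0), (9.0, 1), (10.0, 1)]
unlabeled = [0.5, 1.5, 8.5, 9.5, 5.0]
centroids, leftover = self_train(labeled, unlabeled)
```

The ambiguous midpoint is never pseudo-labeled, which shows why a confidence threshold matters when unlabeled samples outnumber labeled ones by several orders of magnitude.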

Results

The optimal SS-PP model, based on GBDT, achieved an AUC of 0.947. Comparative analysis with supervised learning models and semi-supervised methods demonstrated an average AUC improvement of 8% and state-of-the-art performance, respectively.

Conclusion

This study pioneers the exploration of utilizing limited labeled data for HR-CI predictions and evaluates the benefits of incorporating physical examination data, holding significant implications for the development of cost-effective strategies in relevant healthcare domains.

Journal of Biomedical Informatics, Volume 157, Article 104699.
Citations: 0
A novel deep learning model based on transformer and cross modality attention for classification of sleep stages
IF 4 | Zone 2 (Medicine) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-07-18 | DOI: 10.1016/j.jbi.2024.104689
Sahar Hassanzadeh Mostafaei, Jafar Tanha, Amir Sharafkhaneh

The classification of sleep stages is crucial for gaining insights into an individual’s sleep patterns and identifying potential health issues. Employing several important physiological channels in different views, each providing a distinct perspective on sleep patterns, can greatly affect the efficiency of classification models. Among neural network and deep learning models, transformers are very effective, especially on time series data, and are well suited to sequential data such as physiological channels. Cross-modality attention, by integrating information from multiple views of the data, makes it possible to capture relationships among different modalities, allowing models to focus selectively on the relevant information from each modality. In this paper, we introduce a novel deep-learning model based on a transformer encoder-decoder and cross-modal attention for sleep stage classification. The proposed model processes information from various physiological channels with different modalities using data from the Sleep Heart Health Study (SHHS) dataset, and leverages transformer encoders for feature extraction and cross-modal attention for effective integration before feeding the result into the transformer decoder. The combination of these elements increased the accuracy of the model to 91.33% in classifying five classes of sleep stages. Empirical evaluations demonstrated the model’s superior performance compared to standalone approaches and other state-of-the-art techniques, showcasing the potential of combining transformers and cross-modal attention for improved sleep stage classification.
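
The core of cross-modal attention can be sketched as scaled dot-product attention in which queries come from one modality and keys and values come from another. This is a minimal illustration, not the paper's architecture: learned projection matrices, multiple heads, and the encoder-decoder stack are omitted, and the toy feature vectors and the names `cross_attention`, `modality_a`, and `modality_b_*` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over another modality's keys."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors, one output per query time step.
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


# Toy features: 2 time steps from modality A attend over 3 time steps from modality B.
modality_a = [[1.0, 0.0], [0.0, 1.0]]
modality_b_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
modality_b_values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
fused, attn = cross_attention(modality_a, modality_b_keys, modality_b_values)
```

Each row of `attn` sums to one, so every output step is a convex combination of the other modality's value vectors, weighted by how well that modality's keys match the query.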

Journal of Biomedical Informatics, Volume 157, Article 104689.
Citations: 0
A visual approach to facilitating conversations about supportive care options in the context of cognitive impairment
IF 4 | Zone 2 (Medicine) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-07-15 | DOI: 10.1016/j.jbi.2024.104691
Annie T. Chen, Claire E. Child, Mary Grace Asirot, Kimiko Domoto-Reilly, Anne M. Turner

Background

Persons with cognitive impairment may experience difficulties with language and cognition that interfere with their ability to communicate about health-related decision making.

Objective

We developed a visual elicitation technique to facilitate conversations about preferences concerning potential future supportive care needs and explored the utility of this technique in a qualitative interview study.

Methods

We conducted 15 online interviews with persons with mild cognitive impairment and mild to moderate dementia, using storytelling and a virtual tool designed to facilitate discussion about preferences for supportive care. Interviews were transcribed verbatim and analyzed using an inductive qualitative data analysis method. We report our findings with respect to several main themes. First, we considered participants’ perspectives on supportive care. Next, we examined the utility of the tool for engaging participants in conversation through two themes: cognitive and communicative processes exhibited by participants; and dialogic interactions between the interviewer and the participant.

Results

With respect to participants’ perspectives on supportive care, common themes included considerations relating to informal caregivers such as availability and burden, and the quality of care options such as paid caregivers. Other themes, such as the importance of making decisions as a family, considerations related to facing these challenges on one’s own, and the fluid nature of decision making, also emerged. Common communicative processes included not being responsive to the question and unclear responses. Common cognitive processes included uncertainty and introspection, or self-awareness, of one's cognitive abilities. Last, we examined dialogic interactions between the participant and the interviewer to better understand engagement with the tool. The interviewer was active in using the visualization tool to facilitate the conversation, and participants engaged with the interface to varying degrees. Some participants expressed greater agency and involvement through suggesting images, elaborating on their or the interviewer’s comments, and suggesting icon labels.

Conclusion

This article presents a visual method to engage older adults with cognitive impairment in active dialogue about complex decisions. Though designed for a research setting, the diverse communication and participant-interviewer interaction patterns observed in this study suggest that the tool might be adapted for use in clinical or community settings.

Journal of Biomedical Informatics, Volume 157, Article 104691.
Citations: 0
Recommendations to promote fairness and inclusion in biomedical AI research and clinical use
IF 4 | CAS Zone 2, Medicine | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-07-15 | DOI: 10.1016/j.jbi.2024.104693
Ashley C. Griffin, Karen H. Wang, Tiffany I. Leung, Julio C. Facelli

Objective

Understanding and quantifying biases when designing and implementing actionable approaches to increase fairness and inclusion is critical for artificial intelligence (AI) in biomedical applications.

Methods

In this Special Communication, we discuss how bias is introduced at different stages of the development and use of AI applications in biomedical sciences and health care. We describe various AI applications and their implications for fairness and inclusion in sections on 1) Bias in Data Source Landscapes, 2) Algorithmic Fairness, 3) Uncertainty in AI Predictions, 4) Explainable AI for Fairness and Equity, and 5) Sociological/Ethnographic Issues in Data and Results Representation.

Results

We provide recommendations to address biases when developing and using AI in clinical applications.

Conclusion

These recommendations can be applied to informatics research and practice to foster more equitable and inclusive health care systems and research discoveries.

Citations: 0
Semi-supervised Double Deep Learning Temporal Risk Prediction (SeDDLeR) with Electronic Health Records
IF 4 | CAS Zone 2, Medicine | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-07-14 | DOI: 10.1016/j.jbi.2024.104685
Isabelle-Emmanuella Nogues, Jun Wen, Yihan Zhao, Clara-Lea Bonzel, Victor M. Castro, Yucong Lin, Shike Xu, Jue Hou, Tianxi Cai

Background:

Risk prediction plays a crucial role in planning for prevention, monitoring, and treatment. Electronic Health Records (EHRs) offer an expansive repository of temporal medical data encompassing both risk factors and outcome indicators essential for effective risk prediction. However, challenges emerge due to the lack of readily available gold-standard outcomes and the complex effects of various risk factors. Compounding these challenges are the false positives in diagnosis codes and the formidable task of pinpointing the onset timing in annotations.

Objective:

We develop a Semi-supervised Double Deep Learning Temporal Risk Prediction (SeDDLeR) algorithm based on extensive unlabeled longitudinal Electronic Health Records (EHR) data augmented by a limited set of gold standard labels on the binary status information indicating whether the clinical event of interest occurred during the follow-up period.

Methods:

The SeDDLeR algorithm calculates an individualized risk of developing future clinical events over time using each patient’s baseline EHR features via the following steps: (1) construction of an initial EHR-derived surrogate as a proxy for the onset status; (2) deep learning calibration of the surrogate along gold-standard onset status; and (3) semi-supervised deep learning for risk prediction combining calibrated surrogates and gold-standard onset status. To account for missing onset time and heterogeneous follow-up, we introduce temporal kernel weighting. We devise a Gated Recurrent Units (GRUs) module to capture temporal characteristics. We subsequently assess our proposed SeDDLeR method in simulation studies and apply the method to the Massachusetts General Brigham (MGB) Biobank to predict type 2 diabetes (T2D) risk.
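The abstract names a Gated Recurrent Unit (GRU) module but gives no architectural details. As a rough illustration of how a GRU summarizes a sequence of observations, the sketch below implements the textbook GRU update for a scalar input and a scalar hidden state. The parameter names (`wz`, `uz`, `bz`, and so on), the scalar sizes, and the `encode_sequence` helper are assumptions made for this example and are not taken from the paper. Some references swap the roles of `z` and `1 - z` in the final interpolation.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function used by both GRU gates."""
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x: float, h: float, p: dict) -> float:
    """One textbook GRU update for scalar input x and scalar hidden state h.

    p holds hypothetical scalar parameters; a real model would use matrices.
    """
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate
    # Candidate state: the reset gate controls how much history is used.
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    # Interpolate between the previous state and the candidate state.
    return (1.0 - z) * h + z * h_cand


def encode_sequence(xs: list, p: dict, h0: float = 0.0) -> float:
    """Run the GRU over a sequence and return the final hidden state."""
    h = h0
    for x in xs:
        h = gru_step(x, h, p)
    return h


# With all parameters set to zero, both gates equal 0.5 and the candidate
# state is 0, so each step simply halves the hidden state.
ZERO_PARAMS = {k: 0.0 for k in
               ("wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh")}
```

Because the new state is a convex combination of the old state and a `tanh` output, the hidden state stays within (-1, 1) whenever it starts there, which keeps long sequences numerically stable.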

Results:

SeDDLeR outperforms benchmark risk prediction methods, including the Semi-parametric Transformation Model (STM) and DeepHit, with consistently the best accuracy across experiments. In the MGB T2D study, SeDDLeR achieved the best C-statistic (0.815, SE 0.023; vs STM +0.084, SE 0.030, P-value 0.004; vs DeepHit +0.055, SE 0.027, P-value 0.024) and the best average time-specific AUC (0.778, SE 0.022; vs STM +0.059, SE 0.039, P-value 0.067; vs DeepHit +0.168, SE 0.032, P-value <0.001).
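For readers unfamiliar with the C-statistic, the sketch below computes Harrell's concordance index for right-censored data: among all pairs in which the patient with the shorter follow-up time had an observed event, it counts how often that patient also received the higher predicted risk. The abstract does not say which C-statistic estimator the authors used, so this is only one common definition, and the function name and toy inputs are illustrative.

```python
def concordance_index(times, events, risks):
    """Harrell's C for right-censored data.

    times  : observed follow-up time per patient
    events : 1 if the event was observed, 0 if the patient was censored
    risks  : predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:  # i had the event before j's last follow-up
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four patients, the third one censored at time 6.
toy_c = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.7, 0.8, 0.1],
)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. Pairs with tied event times are ignored in this simple version.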

Conclusion:

SeDDLeR can train robust risk prediction models on both real-world EHR and synthetic datasets with minimal requirements for labeled event times. It has the potential to be incorporated into future clinical trial recruitment or clinical decision-making.

Citations: 0
Leveraging error-prone algorithm-derived phenotypes: Enhancing association studies for risk factors in EHR data
IF 4 | CAS Zone 2, Medicine | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-07-14 | DOI: 10.1016/j.jbi.2024.104690
Yiwen Lu, Jiayi Tong, Jessica Chubak, Thomas Lumley, Rebecca A Hubbard, Hua Xu, Yong Chen

Objectives

It has become increasingly common for multiple computable phenotypes from electronic health records (EHR) to be developed for a given phenotype. However, EHR-based association studies often focus on a single phenotype. In this paper, we develop a method aiming to simultaneously make use of multiple EHR-derived phenotypes for reduction of bias due to phenotyping error and improved efficiency of phenotype/exposure associations.

Materials and Methods

The proposed method combines multiple algorithm-derived phenotypes with a small set of validated outcomes to reduce bias and improve estimation accuracy and efficiency. The performance of our method was evaluated through simulation studies and real-world application to an analysis of colon cancer recurrence using EHR data from Kaiser Permanente Washington.

Results

In settings where no single surrogate performed uniformly better than all others in terms of both sensitivity and specificity, our method achieved substantial bias reduction compared to using a single algorithm-derived phenotype. Our method also improved estimation efficiency by up to 30% compared to an estimator that used only one algorithm-derived phenotype.
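The setting described above, where no surrogate is uniformly better on both sensitivity and specificity, can be checked directly on a validated subset. The sketch below compares two hypothetical algorithm-derived phenotypes against chart-validated labels. The data and the helper names are invented for illustration and do not come from the study, and this is not the authors' estimation method.

```python
def sensitivity_specificity(truth, surrogate):
    """Return (sensitivity, specificity) of a binary surrogate vs. validated truth."""
    tp = sum(1 for t, s in zip(truth, surrogate) if t == 1 and s == 1)
    fn = sum(1 for t, s in zip(truth, surrogate) if t == 1 and s == 0)
    tn = sum(1 for t, s in zip(truth, surrogate) if t == 0 and s == 0)
    fp = sum(1 for t, s in zip(truth, surrogate) if t == 0 and s == 1)
    return tp / (tp + fn), tn / (tn + fp)


def dominates(a, b):
    """True if metrics a are at least as good as b on both axes and better on one."""
    return a[0] >= b[0] and a[1] >= b[1] and a != b


# Hypothetical validated outcomes and two algorithm-derived phenotypes.
validated = [1, 1, 1, 0, 0, 0, 0, 1]
algo_a = [1, 1, 0, 0, 0, 0, 1, 1]   # misses one case, one false alarm
algo_b = [1, 1, 1, 1, 1, 0, 0, 1]   # catches every case, two false alarms

metrics_a = sensitivity_specificity(validated, algo_a)
metrics_b = sensitivity_specificity(validated, algo_b)
no_clear_winner = not dominates(metrics_a, metrics_b) and not dominates(metrics_b, metrics_a)
```

Here algorithm A is more specific and algorithm B is more sensitive, so neither dominates. This is the situation in which the abstract reports the largest gains from combining surrogates.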

Discussion

Simulation studies and application to real-world data demonstrated the effectiveness of our method in integrating multiple phenotypes, thereby enhancing bias reduction, statistical accuracy and efficiency.

Conclusions

Our method combines information across multiple surrogates using a statistically efficient seemingly unrelated regression framework. Our method provides a robust alternative to single-surrogate-based bias correction, especially in contexts lacking information on which surrogate is superior.

Citations: 0
Evaluating gender bias in ML-based clinical risk prediction models: A study on multiple use cases at different hospitals
IF 4 | CAS Zone 2, Medicine | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-07-14 | DOI: 10.1016/j.jbi.2024.104692
Patricia Cabanillas Silva, Hong Sun, Pablo Rodriguez-Brazzarola, Mohamed Rezk, Xianchao Zhang, Janis Fliegenschmidt, Nikolai Hulde, Vera von Dossow, Laurent Meesseman, Kristof Depraetere, Ralph Szymanowsky, Jörg Stieg, Fried-Michael Dahlweid

Background

An inherent difference exists between male and female bodies, and the historical under-representation of females in clinical trials has widened this gap in existing healthcare data. The fairness of clinical decision-support tools is at risk when they are developed from biased data. This paper aims to quantitatively assess the gender bias in risk prediction models. We aim to generalize our findings by performing this investigation on multiple use cases at different hospitals.

Methods

First, we conduct a thorough analysis of the source data to find gender-based disparities. Secondly, we assess the model performance on different gender groups at different hospitals and on different use cases. Performance evaluation is quantified using the area under the receiver-operating characteristic curve (AUROC). Lastly, we investigate the clinical implications of these biases by analyzing the underdiagnosis and overdiagnosis rate, and the decision curve analysis (DCA). We also investigate the influence of model calibration on mitigating gender-related disparities in decision-making processes.
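The per-group evaluation described above can be sketched in a few lines. The example below computes AUROC separately for each gender group, together with an underdiagnosis rate. The abstract does not define the underdiagnosis rate, so this sketch assumes it is the share of true positive cases scored below the decision threshold (the false negative rate). The data, the threshold of 0.5, and all function names are illustrative assumptions.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive case outranks a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def underdiagnosis_rate(labels, scores, threshold):
    """Assumed definition: fraction of true positives scored below the threshold."""
    pos_scores = [s for y, s in zip(labels, scores) if y == 1]
    missed = sum(1 for s in pos_scores if s < threshold)
    return missed / len(pos_scores)


def metrics_by_group(groups, labels, scores, threshold=0.5):
    """Compute both metrics separately for every group label."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        y = [labels[i] for i in idx]
        s = [scores[i] for i in idx]
        result[g] = {
            "auroc": auroc(y, s),
            "underdiagnosis_rate": underdiagnosis_rate(y, s, threshold),
        }
    return result


# Hypothetical predictions for eight patients.
report = metrics_by_group(
    groups=["F", "F", "F", "F", "M", "M", "M", "M"],
    labels=[1, 1, 0, 0, 1, 1, 0, 0],
    scores=[0.4, 0.8, 0.3, 0.6, 0.9, 0.7, 0.2, 0.1],
)
```

In this toy data the model ranks male patients perfectly but misses half of the female cases at the chosen threshold. That is the kind of gap a per-group report is meant to surface, although real analyses would also need confidence intervals.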

Results

Our data analysis reveals notable variations in incidence rates, AUROC, and overdiagnosis rates across different genders, hospitals and clinical use cases. However, we also observe that the underdiagnosis rate is consistently higher in the female population. In general, the female population exhibits lower incidence rates, and the models perform worse when applied to this group. Furthermore, the decision curve analysis shows no statistically significant difference in the model’s clinical utility across gender groups within the threshold range of interest.

Conclusion

The presence of gender bias within risk prediction models varies across different clinical use cases and healthcare institutions. Although an inherent difference is observed between male and female populations at the data source level, this variance does not affect the parity of clinical utility. In conclusion, the evaluations conducted in this study highlight the importance of continuously monitoring gender-based disparities from various perspectives in clinical risk prediction models.

Citations: 0