Journal of Biomedical Informatics: Latest Articles

Evaluating accuracy and fairness of clinical decision support algorithms when health care resources are limited
IF 4.5 | CAS Tier 2 (Medicine) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-06-06 | DOI: 10.1016/j.jbi.2024.104664
Esther L. Meerwijk , Duncan C. McElfresh , Susana Martins , Suzanne R. Tamang

Objective

Guidance on how to evaluate accuracy and algorithmic fairness across subgroups is missing for clinical models that flag patients for an intervention when the health care resources to administer that intervention are limited. We aimed to propose a framework of metrics fit for this specific use case.

Methods

We evaluated the following metrics and applied them to a Veterans Health Administration clinical model that flags for intervention those outpatients prescribed opioids (N = 405,817) who are at risk of overdose or a suicidal event: the receiver operating characteristic (ROC) curve and area under the curve (AUC), the precision-recall curve, the calibration (reliability) curve, the false positive rate, the false negative rate, and the false omission rate. In addition, we developed a new approach to visualize false positives and false negatives that we named 'per true positive bars.' We demonstrate the utility of these metrics to our use case for three cohorts of patients at the highest risk (top 0.5 %, 1.0 %, and 5.0 %) by evaluating algorithmic fairness across the following age groups: ≤30, 31–50, 51–65, and >65 years old.
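The resource-limited setting described above amounts to flagging only a fixed top fraction of patients by risk score and then reading off confusion-matrix error rates. A minimal sketch (function name and toy data are illustrative, not from the paper; `fp_per_tp` is our reading of what a 'per true positive bar' summarizes):

```python
import numpy as np

def topk_error_rates(y_true, scores, top_frac):
    """Error rates when only the top `top_frac` of patients (by risk
    score) can receive the intervention, as in a resource-limited setting."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    n_flagged = max(1, int(round(top_frac * len(scores))))
    flagged = np.zeros(len(scores), dtype=bool)
    flagged[np.argsort(-scores)[:n_flagged]] = True  # highest-risk patients

    tp = np.sum(flagged & (y_true == 1))
    fp = np.sum(flagged & (y_true == 0))
    fn = np.sum(~flagged & (y_true == 1))
    tn = np.sum(~flagged & (y_true == 0))

    return {
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        "false_negative_rate": fn / (fn + tp) if fn + tp else 0.0,
        "false_omission_rate": fn / (fn + tn) if fn + tn else 0.0,
        # false positives incurred per true positive found
        "fp_per_tp": fp / tp if tp else float("inf"),
    }
```

Computing these rates separately for each age group and comparing them across groups is the kind of subgroup fairness check the study performs.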

Results

Metrics that allowed us to assess group differences most clearly were the false positive rate, false negative rate, false omission rate, and the new 'per true positive bars'. Metrics with limited utility for our use case were the ROC curve and AUC, the calibration (reliability) curve, and the precision-recall curve.

Conclusion

There is no “one size fits all” approach to model performance monitoring and bias analysis. Our work informs future researchers and clinicians who seek to evaluate the accuracy and fairness of predictive models that identify patients to intervene on in the context of limited health care resources. In terms of ease of interpretation and utility for our use case, the new 'per true positive bars' may be the most intuitive for a range of stakeholders; they facilitate choosing a threshold that weighs false positives against false negatives, which is especially important when predicting severe adverse events.

Citations: 0
Understanding random resampling techniques for class imbalance correction and their consequences on calibration and discrimination of clinical risk prediction models
IF 4.5 | CAS Tier 2 (Medicine) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-06-06 | DOI: 10.1016/j.jbi.2024.104666
Marco Piccininni , Maximilian Wechsung , Ben Van Calster , Jessica L. Rohmann , Stefan Konigorski , Maarten van Smeden

Objective

Class imbalance is sometimes considered a problem when developing clinical prediction models and assessing their performance. To address it, correction strategies that manipulate the training dataset, such as random undersampling or oversampling, are frequently used. The aim of this article is to illustrate the consequences of these class imbalance correction strategies for clinical prediction models' internal validity in terms of calibration and discrimination performance.

Methods

We used both heuristic intuition and formal mathematical reasoning to characterize the relations between the conditional probabilities of interest and the probabilities targeted when using random undersampling or oversampling. We propose a plug-in estimator that represents a natural correction for predictions obtained from models that have been trained on artificially balanced datasets (“naïve” models). We conducted a Monte Carlo simulation with two different data generation processes and present a real-world example using data from the International Stroke Trial database to empirically demonstrate the consequences of applying random resampling techniques for class imbalance correction on calibration and discrimination (in terms of area under the ROC curve, AUC) for logistic regression and tree-based prediction models.
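A standard prior-correction of this kind maps a probability predicted by a model trained at an artificial prevalence (`train_prev`) back to the original outcome prevalence (`true_prev`). The sketch below shows the generic formula; the paper's plug-in estimator is defined formally in the article and may differ in detail:

```python
def plugin_correct(p_balanced, true_prev, train_prev=0.5):
    """Map a predicted probability from a model trained on an
    artificially balanced dataset back to the original prevalence.
    Generic prior-correction formula, not necessarily the paper's exact estimator."""
    # Rescale the odds by the ratio of true-prevalence odds to training-prevalence odds.
    num = p_balanced * true_prev / train_prev
    den = num + (1 - p_balanced) * (1 - true_prev) / (1 - train_prev)
    return num / den
```

The correction is monotone in `p_balanced`, so it changes calibration while leaving rankings, and hence AUC, untouched, which is consistent with the finding that corrected models matched the naïve ones on discrimination.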

Results

Across our simulations and in the real-world example, calibration of the naïve models was very poor. The models using the plug-in estimator generally outperformed the models relying on class imbalance correction in terms of calibration while achieving the same discrimination performance.

Conclusion

Random resampling techniques for class imbalance correction do not generally improve discrimination performance (i.e., AUC), and their use is hard to justify when aiming at providing calibrated predictions. Improper use of such class imbalance correction techniques can lead to suboptimal data usage and less valid risk prediction models.

Citations: 0
Towards the automatic calculation of the EQUAL Candida Score: Extraction of CVC-related information from EMRs of critically ill patients with candidemia in Intensive Care Units
IF 4.5 | CAS Tier 2 (Medicine) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-06-05 | DOI: 10.1016/j.jbi.2024.104667
Sara Mora , Daniele Roberto Giacobbe , Claudia Bartalucci , Giulia Viglietti , Malgorzata Mikulska , Antonio Vena , Lorenzo Ball , Chiara Robba , Alice Cappello , Denise Battaglini , Iole Brunetti , Paolo Pelosi , Matteo Bassetti , Mauro Giacomini

Objectives

Candidemia is the most frequent invasive fungal disease and the fourth most frequent bloodstream infection in hospitalized patients. Its optimal management is crucial for improving patients’ survival. The quality of candidemia management can be assessed with the EQUAL Candida Score. The objective of this work is to support its automatic calculation by extracting central venous catheter-related information from Italian text in clinical notes of electronic medical records.

Materials and methods

The sample includes 4,787 clinical notes from 108 patients hospitalized between January 2018 and December 2020 in the Intensive Care Units of the IRCCS San Martino Polyclinic Hospital in Genoa (Italy). The devised pipeline exploits natural language processing (NLP) to produce numerical representations of clinical notes that are used as input to machine learning (ML) algorithms to identify CVC presence and removal. It compares the performance of (i) a rule-based method, (ii) a count-based method combined with an ML algorithm, and (iii) a transformer-based model.
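To make the count-based branch concrete, here is a deliberately minimal sketch: a bag-of-words representation of a note, with a nearest-centroid rule standing in for the ML step. The actual study used stronger classifiers (e.g., random forests) on real Italian-language notes; all names and data below are illustrative.

```python
from collections import Counter
import math

def bow_vector(text, vocab):
    """Count-based (bag-of-words) representation of a clinical note."""
    counts = Counter(text.lower().split())
    return [counts.get(term, 0) for term in vocab]

def nearest_centroid_predict(train_vecs, train_labels, vec):
    """Toy stand-in for the ML step: predict the label whose class
    centroid is closest (Euclidean distance) to the input vector."""
    centroids = {}
    for label in set(train_labels):
        rows = [v for v, l in zip(train_vecs, train_labels) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    return min(centroids, key=lambda label: dist(centroids[label], vec))
```

In the study's setting, the two prediction targets would be CVC presence and CVC removal, each framed as a note-level classification task.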

Results

Results obtained with the three approaches were evaluated in terms of weighted F1 score. The random forest classifier showed the highest performance in both tasks, reaching 82.35%.

Conclusion

The present work constitutes a first step towards the automatic calculation of the EQUAL Candida Score from unstructured daily collected data by combining ML and NLP methods. The automatic calculation of the EQUAL Candida Score could provide crucial real-time feedback on the quality of candidemia management, aimed at further improving patients’ health.

Citations: 0
Promoting equity in clinical research: The role of social determinants of health
IF 4.5 | CAS Tier 2 (Medicine) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-06-04 | DOI: 10.1016/j.jbi.2024.104663
Betina Idnay , Yilu Fang , Edward Stanley , Brenda Ruotolo , Wendy K. Chung , Karen Marder , Chunhua Weng

Objective

This study aims to investigate the association between social determinants of health (SDoH) and clinical research recruitment outcomes and to recommend evidence-based strategies to enhance equity.

Materials and Methods

Data were collected from the internal clinical study manager database, the clinical data warehouse, and the clinical research registry. Study characteristics (e.g., study phase) and sociodemographic information were extracted. Median neighborhood income, distance from the study location, and the Area Deprivation Index (ADI) were calculated. Mixed-effects generalized regression was used to account for clustering effects, with false discovery rate adjustment for multiple testing. A stratified analysis was performed to examine the impact in distinct medical departments.
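The abstract does not say which false discovery rate procedure was applied; the Benjamini-Hochberg step-up adjustment is a common choice, and a self-contained sketch of it looks like this:

```python
def benjamini_hochberg(pvals):
    """Benjamini-Hochberg FDR-adjusted p-values (one common procedure;
    the study does not specify which FDR correction it used)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank_from_end, i in enumerate(reversed(order)):
        rank = m - rank_from_end  # 1-based rank of p-value i
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each adjusted value caps the expected proportion of false discoveries among tests declared significant at that level, which is why it suits screening many subgroup associations at once.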

Results

The study sample consisted of 3,962 individuals, with a mean age of 61.5 years, 53.6 % male, 54.2 % White, and 49.1 % non-Hispanic or Latino. Study characteristics revealed a variety of protocols across different departments, with cardiology having the highest percentage of participants (46.4 %). Industry funding was the most common (74.5 %), and digital advertising and personal outreach were the main recruitment methods (58.9 % and 90.8 %).

Discussion

The analysis demonstrated significant associations between participant characteristics and research participation, including biological sex, age, ethnicity, and language. The stratified analysis revealed other significant associations for recruitment strategies. SDoH is crucial to clinical research recruitment, and this study presents evidence-based solutions for equity and inclusivity. Researchers can tailor recruitment strategies to overcome barriers and increase participant diversity by identifying participant characteristics and research involvement status.

Conclusion

The findings highlight the relevance of clinical research inequities and equitable representation of historically underrepresented populations. We need to improve recruitment strategies to promote diversity and inclusivity in research.

Citations: 0
Data harmonization and federated learning for multi-cohort dementia research using the OMOP common data model: A Netherlands consortium of dementia cohorts case study
IF 4.5 | CAS Tier 2 (Medicine) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-05-26 | DOI: 10.1016/j.jbi.2024.104661
Pedro Mateus , Justine Moonen , Magdalena Beran , Eva Jaarsma , Sophie M. van der Landen , Joost Heuvelink , Mahlet Birhanu , Alexander G.J. Harms , Esther Bron , Frank J. Wolters , Davy Cats , Hailiang Mei , Julie Oomens , Willemijn Jansen , Miranda T. Schram , Andre Dekker , Inigo Bermejo

Background

Establishing collaborations between cohort studies has been fundamental for progress in health research. However, such collaborations are hampered by heterogeneous data representations across cohorts and by legal constraints on data sharing. The first arises from a lack of consensus on standards of data collection and representation across cohort studies and is usually tackled by applying data harmonization processes. The second is increasingly important due to heightened awareness of privacy protection and stricter regulations, such as the GDPR. Federated learning has emerged as a privacy-preserving alternative to transferring data between institutions, as it analyzes data in a decentralized manner.

Methods

In this study, we set up a federated learning infrastructure for a consortium of nine Dutch cohorts with data relevant to the etiology of dementia, including an extract, transform, and load (ETL) pipeline for data harmonization. Additionally, we assessed the challenges of transforming and standardizing cohort data using the Observational Medical Outcomes Partnership (OMOP) common data model (CDM) and evaluated our tool in one of the cohorts employing federated algorithms.
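To give a flavor of the transform step of such an ETL pipeline, the sketch below maps a hypothetical source record onto a row of the OMOP CDM `person` table (8507 and 8532 are the standard OMOP concept IDs for male and female gender; the source field names are invented and real pipelines map many more fields and tables):

```python
# Standard OMOP gender concept IDs; 0 denotes "no matching concept".
GENDER_CONCEPTS = {"M": 8507, "F": 8532}

def to_omop_person(record, person_id):
    """Transform step of a minimal ETL: source cohort record -> OMOP `person` row."""
    return {
        "person_id": person_id,
        "gender_concept_id": GENDER_CONCEPTS.get(record["sex"], 0),
        "year_of_birth": int(record["birth_year"]),
    }
```

Once every cohort exposes its data in this shared shape, the same federated analysis code can run unchanged at each site, which is the point of combining a CDM with federated learning.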

Results

We successfully applied our ETL tool and observed a complete coverage of the cohorts’ data by the OMOP CDM. The OMOP CDM facilitated the data representation and standardization, but we identified limitations for cohort-specific data fields and in the scope of the vocabularies available. Specific challenges arise in a multi-cohort federated collaboration due to technical constraints in local environments, data heterogeneity, and lack of direct access to the data.

Conclusion

In this article, we describe the solutions to these challenges and limitations encountered in our study. Our study shows the potential of federated learning as a privacy-preserving solution for multi-cohort studies that enhance reproducibility and reuse of both data and analyses.

Citations: 0
Identifying erroneous height and weight values from adult electronic health records in the All of Us research program
IF 4.5 | CAS Tier 2 (Medicine) | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-05-23 | DOI: 10.1016/j.jbi.2024.104660
Andrew Guide , Lina Sulieman , Shawn Garbett , Robert M Cronin , Matthew Spotnitz , Karthik Natarajan , Robert J. Carroll , Paul Harris , Qingxia Chen

Introduction

Electronic Health Records (EHR) are a useful data source for research, but their usability is hindered by measurement errors. This study investigated an automatic error detection algorithm for adult height and weight measurements in EHR for the All of Us Research Program (All of Us).

Methods

We developed reference charts for adult heights and weights, stratified on participant sex. Our analysis included 4,076,534 height and 5,207,328 weight measurements from approximately 150,000 participants. Errors were identified using modified standard deviation scores, differences from expected values, and significant changes between consecutive measurements. We evaluated our method against chart-reviewed heights (8,092) and weights (9,039) from 250 randomly selected participants and compared it with the current cleaning algorithm in All of Us.
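The two kinds of checks named above, an extreme modified deviation score and an implausible jump between consecutive measurements, can be sketched as below. The thresholds are illustrative rather than the study's calibrated values, and real use would stratify the reference statistics by sex as the authors did:

```python
import statistics

def flag_errors(values, max_abs_z=3.5, max_step=0.33):
    """Flag measurements whose modified z-score (median/MAD-based) is
    extreme, or that jump too far (relative change) from the previous
    accepted value. Thresholds are illustrative, not the paper's."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values) or 1e-9
    flags = []
    prev = None  # last value accepted as plausible
    for v in values:
        z = 0.6745 * (v - med) / mad  # modified z-score
        jump = prev is not None and abs(v - prev) / prev > max_step
        flags.append(abs(z) > max_abs_z or jump)
        if not flags[-1]:
            prev = v
    return flags
```

Comparing the flags against chart review, as the study does, then yields the sensitivity and precision figures reported in the Results.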

Results

The proposed algorithm flagged 1.4 % of height and 1.5 % of weight measurements as errors in the full cohort. Sensitivity was 90.4 % (95 % CI: 79.0–96.8 %) for heights and 65.9 % (95 % CI: 56.9–74.1 %) for weights. Precision was 73.4 % (95 % CI: 60.9–83.7 %) for heights and 62.9 % (95 % CI: 54.0–71.1 %) for weights. In comparison, the current cleaning algorithm has inferior sensitivity (55.8 %) and precision (16.5 %) for height errors, while having higher precision (94.0 %) and lower sensitivity (61.9 %) for weight errors.

Discussion

Our proposed algorithm detected height errors more effectively than weight errors. It can serve as a valuable addition to the current All of Us cleaning algorithm for identifying erroneous height values.

Converting OMOP CDM to phenopackets: A model alignment and patient data representation evaluation
IF 4.5 CAS Tier 2 (Medicine) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2024-05-21 DOI: 10.1016/j.jbi.2024.104659
Kayla Schiffer-Kane , Cong Liu , Tiffany J. Callahan , Casey Ta , Jordan G. Nestor , Chunhua Weng

Objective

This study aims to promote interoperability in precision medicine and translational research by aligning the Observational Medical Outcomes Partnership (OMOP) and Phenopackets data models. Phenopackets is an expert knowledge-driven schema designed to facilitate the storage and exchange of multimodal patient data, and support downstream analysis. The first goal of this paper is to explore model alignment by characterizing the common data models using a newly developed data transformation process and evaluation method. Second, using OMOP-normalized clinical data, we evaluate the mapping of real-world patient data to Phenopackets. We evaluate the suitability of Phenopackets as a patient data representation for real-world clinical cases.

Methods

We identified mappings between OMOP and Phenopackets and applied them to a real patient dataset to assess the transformation’s success. We analyzed gaps between the models and identified key considerations for transforming data between them. Further, to improve ambiguous alignment, we incorporated Unified Medical Language System (UMLS) semantic type-based filtering to direct individual concepts to their most appropriate domain and conducted a domain-expert evaluation of the mapping’s clinical utility.
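One way to picture the semantic-type-based filtering: a lookup from UMLS type identifiers (TUIs) to target Phenopacket blocks, falling back to the OMOP-domain correspondence when no type matches. The TUI codes below are real UMLS identifiers, but the routing table itself is a hypothetical sketch, not the mapping used in the study:

```python
# TUIs are real UMLS identifiers; the block names and this routing table
# are illustrative, not the study's actual mapping.
SEMTYPE_TO_BLOCK = {
    "T047": "diseases",            # Disease or Syndrome
    "T184": "phenotypicFeatures",  # Sign or Symptom
    "T121": "medicalActions",      # Pharmacologic Substance
    "T060": "measurements",        # Diagnostic Procedure
}

def route_concept(semtypes, default_block):
    """Return the Phenopacket block for a concept: prefer the block implied
    by its UMLS semantic types, else fall back to the OMOP-domain default."""
    for t in semtypes:
        if t in SEMTYPE_TO_BLOCK:
            return SEMTYPE_TO_BLOCK[t]
    return default_block
```

A concept typed `T047` would be routed to the disease block even if its OMOP domain suggested otherwise, which is the kind of ambiguity resolution the domain-expert evaluation scored.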

Results

The OMOP to Phenopacket transformation pipeline was executed for 1,000 Alzheimer’s disease patients and successfully mapped all required entities. However, due to missing values in OMOP for required Phenopacket attributes, 10.2 % of records were lost. The use of UMLS semantic type filtering for ambiguous alignment of individual concepts resulted in 96 % agreement with clinical thinking, up from 68 % when mapping exclusively by domain correspondence.

Conclusion

This study presents a pipeline to transform data from OMOP to Phenopackets. We identified considerations for the transformation to ensure data quality, handling restrictions for successful Phenopacket validation and discrepant data formats. We identified unmappable Phenopacket attributes that focus on specialty use cases, such as genomics or oncology, which OMOP does not currently support. We introduce UMLS semantic type filtering to resolve ambiguous alignments to the Phenopacket entities most appropriate for real-world interpretation. We provide a systematic approach to align the OMOP and Phenopackets schemas. Our work facilitates future use of Phenopackets in clinical applications by addressing key barriers to interoperability when deriving a Phenopacket from real-world patient data.

Causal fairness assessment of treatment allocation with electronic health records
IF 4.5 CAS Tier 2 (Medicine) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2024-05-21 DOI: 10.1016/j.jbi.2024.104656
Linying Zhang , Lauren R. Richter , Yixin Wang , Anna Ostropolets , Noémie Elhadad , David M. Blei , George Hripcsak

Objective:

Healthcare continues to grapple with the persistent issue of treatment disparities, sparking concerns regarding the equitable allocation of treatments in clinical practice. While various fairness metrics have emerged to assess fairness in decision-making processes, a growing focus has been on causality-based fairness concepts due to their capacity to mitigate confounding effects and reason about bias. However, the application of causal fairness notions in evaluating the fairness of clinical decision-making with electronic health record (EHR) data remains an understudied domain. This study aims to address the methodological gap in assessing causal fairness of treatment allocation with electronic health records data. In addition, we investigate the impact of social determinants of health on the assessment of causal fairness of treatment allocation.

Methods:

We propose a causal fairness algorithm to assess fairness in clinical decision-making. Our algorithm accounts for the heterogeneity of patient populations and identifies potential unfairness in treatment allocation by conditioning on patients who have the same likelihood to benefit from the treatment. We apply this framework to a patient cohort with coronary artery disease derived from an EHR database to evaluate the fairness of treatment decisions.
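The conditioning step can be sketched as a stratification: bin patients by their predicted likelihood of benefiting from treatment, then compare observed treatment rates across demographic groups within each bin. The function below is a simplified illustration, not the paper's algorithm; the benefit scores are assumed to come from a separately fitted outcome model:

```python
from collections import defaultdict

def treatment_rates_by_stratum(patients, n_strata=4):
    """Bin patients into strata of similar predicted benefit, then compute
    the observed treatment rate per demographic group within each stratum.
    Large within-stratum gaps flag potential unfairness. Each patient is a
    dict with keys: benefit (0-1 score from an assumed outcome model),
    group, and treated (bool)."""
    counts = defaultdict(lambda: [0, 0])  # (stratum, group) -> [treated, n]
    for p in patients:
        stratum = min(int(p["benefit"] * n_strata), n_strata - 1)
        key = (stratum, p["group"])
        counts[key][0] += p["treated"]
        counts[key][1] += 1
    return {k: treated / n for k, (treated, n) in counts.items()}
```

Because patients in the same stratum have a comparable likelihood of benefit, a lower treatment rate for one group within a stratum cannot be explained away by clinical need alone.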

Results:

Our analysis reveals notable disparities in coronary artery bypass grafting (CABG) allocation among different patient groups. Women were found to be 4.4%–7.7% less likely to receive CABG than men in two out of four treatment response strata. Similarly, Black or African American patients were 5.4%–8.7% less likely to receive CABG than others in three out of four response strata. These results were similar when social determinants of health (insurance and area deprivation index) were dropped from the algorithm. These findings highlight the presence of disparities in treatment allocation among similar patients, suggesting potential unfairness in the clinical decision-making process.

Conclusion:

This study introduces a novel approach for assessing the fairness of treatment allocation in healthcare. By incorporating responses to treatment into a fairness framework, our method explores the potential of quantifying fairness from a causal perspective using EHR data. Our research advances the methodological development of fairness assessment in healthcare and highlights the importance of causality in determining treatment fairness.

Enhancing the coverage of SemRep using a relation classification approach
IF 4.5 CAS Tier 2 (Medicine) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2024-05-21 DOI: 10.1016/j.jbi.2024.104658
Shufan Ming , Rui Zhang , Halil Kilicoglu

Objective:

Relation extraction is an essential task in the field of biomedical literature mining and offers significant benefits for various downstream applications, including database curation, drug repurposing, and literature-based discovery. The broad-coverage natural language processing (NLP) tool SemRep has established a solid baseline for extracting subject–predicate–object triples from biomedical text and has served as the backbone of the Semantic MEDLINE Database (SemMedDB), a PubMed-scale repository of semantic triples. While SemRep achieves reasonable precision (0.69), its recall is relatively low (0.42). In this study, we aimed to enhance SemRep using a relation classification approach, in order to eventually increase the size and the utility of SemMedDB.

Methods:

We combined and extended existing SemRep evaluation datasets to generate training data. We leveraged the pre-trained PubMedBERT model, enhancing it through additional contrastive pre-training and fine-tuning. We experimented with three entity representations: mentions, semantic types, and semantic groups. We evaluated the model performance on a portion of the SemRep Gold Standard dataset and compared it to SemRep performance. We also assessed the effect of the model on a larger set of 12K randomly selected PubMed abstracts.
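The three entity representations amount to choosing what text stands in for each entity when a sentence is fed to the relation classifier: the surface mention, its semantic type, or its semantic group. A hypothetical sketch of marker insertion (the marker tokens and field names are ours, not the paper's):

```python
def mark_entities(sentence, subj, obj, mode="mention"):
    """Insert subject/object markers for a relation-classification input.
    mode selects the entity representation: 'mention' keeps the surface
    text; 'semtype'/'semgroup' substitute the semantic type or group.
    Marker tokens ([SUB], [OBJ]) are illustrative."""
    def render(ent, tag):
        text = {"mention": ent["text"],
                "semtype": ent["semtype"],
                "semgroup": ent["semgroup"]}[mode]
        return f"[{tag}] {text} [/{tag}]"
    # Replace the later span first so earlier character offsets stay valid.
    for ent, tag in sorted([(subj, "SUB"), (obj, "OBJ")],
                           key=lambda x: -x[0]["start"]):
        s, e = ent["start"], ent["end"]
        sentence = sentence[:s] + render(ent, tag) + sentence[e:]
    return sentence
```

With `mode="semtype"`, "Aspirin treats headache" becomes "[SUB] phsu [/SUB] treats [OBJ] sosy [/OBJ]", letting the classifier generalize across specific drug and symptom mentions.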

Results:

Our results show that the best model yields a precision of 0.62, recall of 0.81, and F1 score of 0.70. Assessment on 12K abstracts shows that the model could double the size of SemMedDB when applied to the entire PubMed. We also manually assessed the quality of 506 triples predicted by the model that SemRep had not previously identified, and found that 67% of these triples were correct.

Conclusion:

These findings underscore the promise of our model in achieving a more comprehensive coverage of relationships mentioned in biomedical literature, thereby showing its potential in enhancing various downstream applications of biomedical literature mining. Data and code related to this study are available at https://github.com/Michelle-Mings/SemRep_RelationClassification.

Using a clinical narrative-aware pre-trained language model for predicting emergency department patient disposition and unscheduled return visits
IF 4.5 CAS Tier 2 (Medicine) Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2024-05-19 DOI: 10.1016/j.jbi.2024.104657
Tzu-Ying Chen , Ting-Yun Huang , Yung-Chun Chang

The increasing prevalence of overcrowding in Emergency Departments (EDs) threatens the effective delivery of urgent healthcare. Mitigation strategies include the deployment of monitoring systems capable of tracking and managing patient disposition to facilitate appropriate and timely care, which subsequently reduces patient revisits, optimizes resource allocation, and enhances patient outcomes. This study used approximately 250,000 emergency department visit records from Taipei Medical University-Shuang Ho Hospital to develop a natural language processing model using BlueBERT, a biomedical domain-specific pre-trained language model, to predict patient disposition status and unplanned readmissions. Data preprocessing and the integration of both structured and unstructured data were central to our approach. BlueBERT outperformed other models owing to its pre-training on a diverse range of medical literature, enabling it to better comprehend the specialized terminology, relationships, and context present in ED data. We found that translating Chinese-English clinical narratives into English and textualizing numerical data into categorical representations significantly improved the prediction of patient disposition (AUROC = 0.9014) and 72-hour unscheduled return visits (AUROC = 0.6475).
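The "textualizing numerical data into categorical representations" step maps a numeric measurement to a short phrase that a text encoder can consume alongside the narrative. A minimal sketch with made-up bin edges and labels (the study's actual binning scheme is not specified here):

```python
# Bin edges and labels are illustrative, not the study's actual scheme.
TEMP_BINS = [(36.0, "low"), (38.0, "normal"), (float("inf"), "high")]

def textualize(name, value, bins):
    """Map a numeric value to 'name label' so it can be concatenated into
    the clinical text fed to the language model."""
    label = next(lab for upper, lab in bins if value < upper)
    return f"{name} {label}"
```

A temperature of 39.2 would enter the model input as "temperature high" rather than a raw float the tokenizer would fragment arbitrarily.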
