AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science: latest articles

Comparative Analysis of Fusion Strategies for Imaging and Non-imaging Data - Use-case of Hospital Discharge Prediction.
Vedant Parikh, Amara Tariq, Bhavik Patel, Imon Banerjee

Accurate prediction of future clinical events such as discharge from hospital can not only improve hospital resource management but also provide an indicator of a patient's clinical condition. In this work, we perform a comparative analysis of deep learning-based fusion strategies against traditional single-source models for prediction of discharge from hospital by fusing information encoded in two diverse but relevant data modalities, i.e., chest X-ray images and tabular electronic health records (EHR). We evaluate multiple fusion strategies, including late, early and joint fusion, in terms of their efficacy for target prediction compared to EHR-only and image-only predictive models. Results indicate the importance of merging information from the two modalities, as fusion models tended to outperform single-modality models, and show that the joint fusion scheme was the most effective for target prediction. The joint fusion model merges the two modalities through a branched neural network that is jointly trained in an end-to-end fashion to extract target-relevant information from both modalities.

AMIA Joint Summits on Translational Science proceedings, 2024: 652-661. Published 2024-05-31. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141810/pdf/
Citations: 0
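The three fusion schemes named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `params` layout, the single hidden layer per branch, and the equal late-fusion weighting are all assumptions made for illustration, and only the forward pass is shown (no training).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def early_fusion_input(image_features, ehr_features):
    """Early fusion: concatenate per-modality features into one model input."""
    return list(image_features) + list(ehr_features)


def late_fusion_probability(p_image, p_ehr, image_weight=0.5):
    """Late fusion: combine the outputs of two separately trained models.

    The weighted average (and the 0.5 default) is an assumption.
    """
    return image_weight * p_image + (1.0 - image_weight) * p_ehr


def dense_relu(x, weights, biases):
    """One fully connected layer with ReLU; weights is a list of rows."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def joint_fusion_forward(image_features, ehr_features, params):
    """Joint fusion forward pass: one branch per modality, one shared head.

    In joint fusion both branches and the head would be trained together
    end to end; `params` is a hypothetical container for those weights.
    """
    h_img = dense_relu(image_features, params["w_img"], params["b_img"])
    h_ehr = dense_relu(ehr_features, params["w_ehr"], params["b_ehr"])
    merged = h_img + h_ehr  # concatenate the two branch representations
    logit = sum(w * h for w, h in zip(params["w_head"], merged)) + params["b_head"]
    return sigmoid(logit)
```

The design difference is where the modalities meet: early fusion merges inputs, late fusion merges predictions, and joint fusion merges learned intermediate representations so that the loss can shape both branches.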
Development and Validation of an Individual Socioeconomic Deprivation Index (ISDI) in the NIH's All of Us Data Network.
Nripendra Acharya, Karthik Natarajan

Many existing composite social determinants of health (SDOH) indices, such as the Area Deprivation Index, are constrained by their reliance on geographic approximations and American Community Survey data. This study builds on the body of literature around deprivation indices to construct an individual socioeconomic deprivation index (ISDI) within the NIH's All of Us Data Network by using weighted multiple correspondence analysis on SDOH data elements collected at the participant level. The correlation between ISDI and another area-approximated index is assessed to the extent possible, along with the changes in an AI model's performance due to stratified sampling based on ISDI quintiles. Individual-level deprivation indices may have a wide range of utility, particularly in the context of precision medicine in both centralized and distributed data networks.

AMIA Joint Summits on Translational Science proceedings, 2024: 36-45. Published 2024-05-31. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141807/pdf/
Citations: 0
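Stratified sampling by index quintile, as mentioned in the abstract above, might look like the sketch below. The rank-based quintile assignment, the equal per-stratum sample size, and both function names are assumptions; the abstract does not say how the quintiles or samples were drawn.

```python
import random


def assign_quintiles(scores):
    """Label each score with a quintile from 1 (lowest) to 5 (highest).

    Quintiles are assigned by rank, so strata are as equal in size as
    possible; ties are broken by input order (an assumption).
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    quintiles = [0] * n
    for rank, i in enumerate(order):
        quintiles[i] = rank * 5 // n + 1
    return quintiles


def stratified_sample(ids, quintiles, per_stratum, seed=0):
    """Draw up to `per_stratum` ids from each quintile without replacement."""
    rng = random.Random(seed)
    sample = []
    for q in range(1, 6):
        members = [pid for pid, label in zip(ids, quintiles) if label == q]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample
```

Sampling the same number of participants from every quintile keeps the most and least deprived groups equally represented in a training set, whatever their share of the cohort.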
Driving Precision of Pediatric VTE Risk-stratification through Genetics.
Samaya S Badrieh, Lisa Bastarache, Xinnan Niu, Jing He, Jamie R Robinson

This study addresses the rising incidence of pediatric venous thromboembolism (VTE) by validating a VTE phenotype and developing a polygenic risk score (PRS) using UK Biobank data. Our findings demonstrate the predictive value of the PRS, enhancing VTE risk assessment in clinical settings. Future steps involve integrating the PRS into risk stratification models.

AMIA Joint Summits on Translational Science proceedings, 2024: 498. Published 2024-05-31. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141856/pdf/
Citations: 0
Pre-test Prediction of Non-ischemic Cardiomyopathies using Time-Series EHR Data.
Kary Ishwaran, Bryan Q Abadie, Po-Hao Chen, Michael Bolen, Tara Karamlou, Richard Grimm, W H Wilson Tang, Christopher Nguyen, Deborah Kwon, David Chen

Clinical imaging is an important test for diagnosing non-ischemic cardiomyopathies (NICM). However, accurate interpretation of imaging studies often requires readers to review patient histories, a time-consuming and tedious task. We propose to use time-series analysis to predict the most likely NICMs using longitudinal electronic health records (EHR) as a pseudo-summary of the patient's records. Time-series formatted EHR data can provide temporal information that is important for accurate prediction of disease. Specifically, we leverage ICD-10 codes and various recurrent neural network architectures for predictive modeling. We trained our models on a large cohort of NICM patients who underwent cardiac magnetic resonance imaging (CMR) and a smaller cohort who underwent transthoracic echocardiogram (TTE). Across all models, the proposed technique achieved good micro-area under the curve (0.8357), F1 score (0.5708) and precision at 3 (0.8078) for CMR, but only moderate performance for TTE (0.6938, 0.4399 and 0.5864, respectively). We show that our model has the potential to provide accurate pre-test differential diagnosis, thereby potentially reducing clerical burden on physicians.

AMIA Joint Summits on Translational Science proceedings, 2024: 239-248. Published 2024-05-31. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141858/pdf/
Citations: 0
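The abstract above reports "precision at 3" without defining it. The sketch below shows two common readings of a top-k ranking metric for a single patient; which one the authors used is not stated, so both functions and their names are assumptions. The class labels in any call are placeholders, not diagnoses from the study.

```python
def top_k(scores, k):
    """Return the k labels with the highest predicted score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def precision_at_k(scores, true_labels, k=3):
    """Fraction of the top-k predicted labels that are correct."""
    hits = sum(1 for label in top_k(scores, k) if label in true_labels)
    return hits / k


def hit_at_k(scores, true_labels, k=3):
    """1.0 if any correct label appears among the top-k predictions."""
    return 1.0 if any(label in true_labels for label in top_k(scores, k)) else 0.0
```

When each patient has a single true diagnosis, `precision_at_k` cannot exceed 1/k, whereas `hit_at_k` can reach 1.0, so the two readings give very different numbers for the same predictions. Averaging either function over patients gives a cohort-level score.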
Enhancing Clinical Predictive Modeling through Model Complexity-Driven Class Proportion Tuning for Class Imbalanced Data: An Empirical Study on Opioid Overdose Prediction.
Yinan Liu, Xinyu Dong, Weimin Lyu, Richard N Rosenthal, Rachel Wong, Tengfei Ma, Jun Kong, Fusheng Wang

Class imbalance issues are prevalent in the medical field and significantly impact the performance of clinical predictive models. Traditional techniques to address this challenge aim to rebalance class proportions. They generally assume that the rebalanced proportions are derived from the original data, without considering the intricacies of the model utilized. This study challenges the prevailing assumption and introduces a new method that ties the optimal class proportions to model complexity. This approach allows for individualized tuning of class proportions for each model. Our experiments, centered on the opioid overdose prediction problem, highlight the performance gains achieved by this approach. Furthermore, rigorous regression analysis affirms the merits of the proposed theoretical framework, demonstrating a statistically significant correlation between hyperparameters controlling model complexity and the optimal class proportions.

AMIA Joint Summits on Translational Science proceedings, 2024: 334-343. Published 2024-05-31. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141828/pdf/
Citations: 0
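Treating the class proportion as a per-model tuning parameter, as the abstract above proposes, can be sketched as follows. The abstract does not describe the resampling scheme or the link to model complexity, so random undersampling of the majority class, the candidate grid, and the caller-supplied `score_fn` are all assumptions.

```python
import random


def undersample_to_proportion(labels, target_pos_fraction, seed=0):
    """Return sorted indices keeping every positive (label 1) and just enough
    randomly chosen negatives (label 0) to reach the target positive fraction.
    """
    if not 0.0 < target_pos_fraction < 1.0:
        raise ValueError("target_pos_fraction must be strictly between 0 and 1")
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_neg = round(len(pos) * (1.0 - target_pos_fraction) / target_pos_fraction)
    n_neg = min(n_neg, len(neg))  # cannot keep more negatives than exist
    rng = random.Random(seed)
    return sorted(pos + rng.sample(neg, n_neg))


def tune_class_proportion(labels, candidates, score_fn):
    """Pick the candidate proportion whose resampled index set scores best.

    `score_fn` is a hypothetical callback: it would train one specific model
    on the given indices and return a validation metric (higher is better).
    """
    return max(candidates,
               key=lambda p: score_fn(undersample_to_proportion(labels, p)))
```

Running `tune_class_proportion` once per model, with a `score_fn` bound to that model and its hyperparameters, yields a separate proportion for each model rather than one proportion fixed by the data.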
Measuring and Reducing Racial Bias in a Pediatric Urinary Tract Infection Model.
Joshua W Anderson, Nader Shaikh, Shyam Visweswaran

Clinical predictive models that include race as a predictor have the potential to exacerbate disparities in healthcare. Such models can be respecified to exclude race or optimized to reduce racial bias. We investigated the impact of such respecifications in a predictive model, UTICalc, which was designed to reduce catheterizations in young children with suspected urinary tract infections. To reduce racial bias, race was removed from the UTICalc logistic regression model and replaced with two new features. We compared the two versions of UTICalc using fairness and predictive performance metrics to understand the effects on racial bias. In addition, we derived three new models for UTICalc to specifically improve racial fairness. Our results show that, as predicted by previously described impossibility results, fairness cannot be simultaneously improved on all fairness metrics, and model respecification may improve racial fairness but decrease overall predictive performance.

AMIA Joint Summits on Translational Science proceedings, 2024: 488-497. Published 2024-05-31. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141814/pdf/
Citations: 0
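The abstract above compares models on several fairness metrics without naming them. As an illustration only, the sketch below computes two widely used group-fairness gaps for binary predictions; the choice of demographic parity and equal opportunity, and all function names, are assumptions rather than the metrics used in the study.

```python
def _rate(values):
    """Mean of a list of 0/1 values, or NaN if the list is empty."""
    return sum(values) / len(values) if values else float("nan")


def group_rates(y_true, y_pred, groups, group):
    """Positive prediction rate and true positive rate within one group."""
    preds = [p for p, g in zip(y_pred, groups) if g == group]
    preds_on_positives = [p for t, p, g in zip(y_true, y_pred, groups)
                          if g == group and t == 1]
    return {"positive_rate": _rate(preds), "tpr": _rate(preds_on_positives)}


def fairness_gaps(y_true, y_pred, groups, group_a, group_b):
    """Differences (group_a minus group_b) in two common fairness metrics."""
    a = group_rates(y_true, y_pred, groups, group_a)
    b = group_rates(y_true, y_pred, groups, group_b)
    return {
        "demographic_parity_diff": a["positive_rate"] - b["positive_rate"],
        "equal_opportunity_diff": a["tpr"] - b["tpr"],
    }
```

The two gaps need not agree: a model can flag both groups at the same rate while still missing more true cases in one of them, which is why a change that closes one gap can leave another open.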
Best of Both Worlds: Bridging One Model for All and Group-Specific Model Approaches using Ensemble-based Subpopulation Modeling.
Purity Mugambi, Stephanie Carreiro

Subpopulation models have become of increasing interest in prediction of clinical outcomes because they promise to perform better for underrepresented patient subgroups. However, the personalization benefits gained from these models come at the cost of statistical power, and the models can be impractical when the subpopulation's sample size is small. We hypothesize that a hierarchical model in which population information is integrated into subpopulation models would preserve the personalization benefits and offset the loss of power. In this work, we integrate ideas from ensemble modeling, personalization, and hierarchical modeling and build ensemble-based subpopulation models in which specialization relies on whole group samples. This approach significantly improves the precision of the positive class, especially for the underrepresented subgroups, with minimal cost to the recall. It consistently outperforms the one-model-for-all and one-model-per-subgroup approaches, especially in the presence of high class imbalance, for subgroups with at least 380 training samples.

AMIA Joint Summits on Translational Science proceedings, 2024: 354-363. Published 2024-05-31. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141864/pdf/
Citations: 0
Clarifying Chronic Obstructive Pulmonary Disease Genetic Associations Observed in Biobanks via Mediation Analysis of Smoking.
Katrina Bazemore, Jaehyun Joo, Wei-Ting Hwang, Blanca E Himes

Varying case definitions of COPD have heterogeneous genetic risk profiles, potentially reflective of disease subtypes or classification bias (e.g., smokers more likely to be diagnosed with COPD). To better understand differences in genetic loci associated with ICD-defined versus spirometry-defined COPD, we contrasted their GWAS results with those for heavy smoking among 337,138 UK Biobank participants. Overlapping risk loci were found in or near the genes ZEB2, FAM136B, CHRNA3, and CHRNA4, with the CHRNA3 locus shared across all three traits. Mediation analysis to estimate the effects of lead genotyped variants mediated by smoking found significant indirect effects for the FAM136B, CHRNA3, and CHRNA4 loci for both COPD definitions. Adjustment for mediator-outcome confounders modestly attenuated indirect effects, though in the CHRNA4 locus for spirometry-defined COPD the proportion mediated increased an additional 8.47%. Our results suggest that differences between ICD-defined and spirometry-defined COPD-associated genetic loci are not a result of smoking biasing classification.

AMIA Joint Summits on Translational Science proceedings, 2024: 499-508. Published 2024-05-31. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141825/pdf/
Citations: 0
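The indirect effect and proportion mediated reported in the abstract above can be illustrated with the classic product-of-coefficients decomposition. This is a simplified sketch for linear models; the abstract does not state which estimator was used, and COPD case status is a binary outcome, so the formula, the function name, and the example coefficients are all assumptions.

```python
def mediation_effects(a, b, direct):
    """Decompose a total effect under the product-of-coefficients approach.

    a:      effect of the exposure (e.g. a variant) on the mediator (e.g. smoking)
    b:      effect of the mediator on the outcome, adjusted for the exposure
    direct: effect of the exposure on the outcome, adjusted for the mediator
    """
    indirect = a * b
    total = indirect + direct
    proportion = indirect / total if total != 0 else float("nan")
    return {"indirect": indirect, "total": total, "proportion_mediated": proportion}
```

With made-up coefficients `a=0.5`, `b=0.4` and `direct=0.3`, the indirect effect is 0.2 of a total 0.5, so 40% of the effect would be attributed to the mediator. The proportion is only meaningful when the direct and indirect effects have the same sign.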
PFERM: A Fair Empirical Risk Minimization Approach with Prior Knowledge.
Bojian Hou, Andrés Mondragón, Davoud Ataee Tarzanagh, Zhuoping Zhou, Andrew J Saykin, Jason H Moore, Marylyn D Ritchie, Qi Long, Li Shen

Fairness is crucial in machine learning to prevent bias based on sensitive attributes in classifier predictions. However, the pursuit of strict fairness often sacrifices accuracy, particularly when significant prevalence disparities exist among groups, making classifiers less practical. For example, Alzheimer's disease (AD) is more prevalent in women than men, making equal treatment inequitable for females. Accounting for prevalence ratios among groups is essential for fair decision-making. In this paper, we introduce prior knowledge for fairness, which incorporates prevalence ratio information into the fairness constraint within the Empirical Risk Minimization (ERM) framework. We develop the Prior-knowledge-guided Fair ERM (PFERM) framework, aiming to minimize expected risk within a specified function class while adhering to a prior-knowledge-guided fairness constraint. This approach strikes a flexible balance between accuracy and fairness. Empirical results confirm its effectiveness in preserving fairness without compromising accuracy.

AMIA Joint Summits on Translational Science proceedings, 2024: 211-220. Published 2024-05-31. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141835/pdf/
Citations: 0
Automated HIV Case Identification from the MIMIC-IV Database.
Kai Jiang, Tru Cao

Automatic HIV phenotyping is needed for HIV research based on electronic health records (EHRs). MIMIC-IV, an extension of MIMIC-III, contains more than 520,000 hospital admissions and has become a valuable EHR database for secondary medical research. However, there was no prior phenotyping algorithm to extract HIV cases from MIMIC-IV, which requires a comprehensive knowledge of the database. Moreover, previous HIV phenotyping algorithms did not consider the new HIV-1/HIV-2 antibody differentiation immunoassay tests that MIMIC-IV contains. Our work provided insight into the structure and data elements in MIMIC-IV and proposed a new HIV phenotyping algorithm to fill in these gaps. The results included MIMIC-IV's data tables and elements used, 1,781 and 1,843 HIV cases from MIMIC-IV's versions 0.4 and 2.1, respectively, and summary statistics of these two HIV case cohorts. They could be used for the development of statistical and machine learning models in future studies about the disease.

AMIA Joint Summits on Translational Science proceedings, vol. 2024, pp. 555-564. Journal Article. Published 2024-05-31. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141847/pdf/
Citations: 0