Proceedings of the ACM Conference on Health, Inference, and Learning: Latest Papers

Hurtful words: quantifying biases in clinical contextual word embeddings
Pub Date : 2020-03-11 DOI: 10.1145/3368555.3384448
H. Zhang, Amy X. Lu, Mohamed Abdalla, Matthew B. A. McDermott, M. Ghassemi
In this work, we examine the extent to which embeddings may encode marginalized populations differently, and how this may lead to a perpetuation of biases and worsened performance on clinical tasks. We pretrain deep embedding models (BERT) on medical notes from the MIMIC-III hospital dataset, and quantify potential disparities using two approaches. First, we identify dangerous latent relationships that are captured by the contextual word embeddings using a fill-in-the-blank method with text from real clinical notes and a log probability bias score quantification. Second, we evaluate performance gaps across different definitions of fairness on over 50 downstream clinical prediction tasks that include detection of acute and chronic conditions. We find that classifiers trained from BERT representations exhibit statistically significant differences in performance, often favoring the majority group with regard to gender, language, ethnicity, and insurance status. Finally, we explore shortcomings of using adversarial debiasing to obfuscate subgroup information in contextual word embeddings, and recommend best practices for such deep embedding models in clinical settings.
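The idea of a "performance gap across different definitions of fairness" can be made concrete with a minimal sketch. The abstract does not say which definitions the authors use, so the two below (a gap in positive-prediction rate and a gap in recall) are assumptions chosen for illustration, and the function names and toy data are hypothetical.

```python
def rate(values):
    """Mean of a list of 0/1 values; 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def fairness_gaps(y_true, y_pred, group, a, b):
    """Return (parity_gap, recall_gap) of group `a` relative to group `b`.

    parity_gap: difference in the rate of positive predictions.
    recall_gap: difference in true-positive rate (recall).
    Both definitions are illustrative assumptions, not taken from the paper.
    """
    def stats(g):
        # All predictions for members of group g.
        preds = [p for p, s in zip(y_pred, group) if s == g]
        # Predictions for members of group g whose true label is positive.
        hits = [p for t, p, s in zip(y_true, y_pred, group) if s == g and t == 1]
        return rate(preds), rate(hits)

    parity_a, recall_a = stats(a)
    parity_b, recall_b = stats(b)
    return parity_a - parity_b, recall_a - recall_b


# Toy labels and predictions, invented purely for illustration.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
group = ["A", "A", "A", "A", "B", "B", "B", "B"]
parity_gap, recall_gap = fairness_gaps(y_true, y_pred, group, "A", "B")
```

A positive gap here means the classifier favors group A under that definition; the two definitions need not agree in sign or size on real data.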
Citations: 109
Survival cluster analysis
Pub Date : 2020-02-29 DOI: 10.1145/3368555.3384465
Paidamoyo Chapfuwa, Chunyuan Li, Nikhil Mehta, L. Carin, Ricardo Henao
Conventional survival analysis approaches estimate risk scores or individualized time-to-event distributions conditioned on covariates. In practice, there is often great population-level phenotypic heterogeneity, resulting from (unknown) subpopulations with diverse risk profiles or survival distributions. As a result, there is an unmet need in survival analysis for identifying subpopulations with distinct risk profiles, while jointly accounting for accurate individualized time-to-event predictions. An approach that addresses this need is likely to improve the characterization of individual outcomes by leveraging regularities in subpopulations, thus accounting for population-level heterogeneity. In this paper, we propose a Bayesian nonparametrics approach that represents observations (subjects) in a clustered latent space, and encourages accurate time-to-event predictions and clusters (subpopulations) with distinct risk profiles. Experiments on real-world datasets show consistent improvements in predictive performance and interpretability relative to existing state-of-the-art survival analysis models.
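To illustrate what "subpopulations with distinct survival distributions" means, the sketch below computes a Kaplan-Meier curve for one cluster of subjects. Kaplan-Meier is a standard nonparametric estimator swapped in here for illustration; it is not the Bayesian nonparametric model the abstract proposes, and the toy follow-up times are invented.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times[i]  : follow-up time of subject i.
    events[i] : 1 if the event was observed, 0 if the subject was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Toy cluster: four subjects, one censored at time 5.
curve = kaplan_meier([2, 3, 5, 8], [1, 1, 0, 1])
```

Running the same function on each cluster separately and comparing the curves is one simple way to check whether clusters really carry different risk profiles.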
Citations: 22
Generative ODE modeling with known unknowns
Pub Date : 2020-02-26 DOI: 10.1145/3450439.3451866
Ori Linial, D. Eytan, Uri Shalit
In several crucial applications, domain knowledge is encoded by a system of ordinary differential equations (ODEs), often stemming from underlying physical and biological processes. A motivating example is intensive care unit patients: the dynamics of vital physiological functions, such as the cardiovascular system with its associated variables (heart rate, cardiac contractility and output, and vascular resistance), can be approximately described by a known system of ODEs. Typically, some of the ODE variables are directly observed (heart rate and blood pressure, for example) while some are unobserved (cardiac contractility, output and vascular resistance), and in addition many other variables are observed but not modeled by the ODE, for example body temperature. Importantly, the unobserved ODE variables are "known-unknowns": we know they exist and we know their functional dynamics, but we cannot measure them directly, nor do we know the function tying them to all observed measurements. As is often the case in medicine, and specifically in the cardiovascular system, estimating these known-unknowns is highly valuable, and they serve as targets for therapeutic manipulations. Under this scenario we wish to learn the parameters of the ODE generating each observed time series, and extrapolate the future of the ODE variables and the observations. We address this task with a variational autoencoder incorporating the known ODE function, called GOKU-net, for Generative ODE modeling with Known Unknowns. We first validate our method on videos of single and double pendulums with unknown length or mass; we then apply it to a model of the cardiovascular system. We show that modeling the known-unknowns allows us to successfully discover clinically meaningful unobserved system parameters, leads to much better extrapolation, and enables learning using much smaller training sets.
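The pendulum validation gives a compact picture of a "known-unknown": the form of the ODE is known, while a parameter such as the length is not. The sketch below only simulates that known ODE for a given length; it is not GOKU-net, which is a variational autoencoder. The choice of a fourth-order Runge-Kutta integrator, the step size, and all names are assumptions made for illustration.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def pendulum_rhs(theta, omega, length):
    """Known ODE: d(theta)/dt = omega, d(omega)/dt = -(G / length) * sin(theta)."""
    return omega, -(G / length) * math.sin(theta)


def simulate(length, theta0, omega0=0.0, dt=0.001, steps=1000):
    """Integrate the pendulum with classical RK4; return the list of angles."""
    theta, omega = theta0, omega0
    traj = [theta]
    for _ in range(steps):
        k1t, k1w = pendulum_rhs(theta, omega, length)
        k2t, k2w = pendulum_rhs(theta + 0.5 * dt * k1t, omega + 0.5 * dt * k1w, length)
        k3t, k3w = pendulum_rhs(theta + 0.5 * dt * k2t, omega + 0.5 * dt * k2w, length)
        k4t, k4w = pendulum_rhs(theta + dt * k3t, omega + dt * k3w, length)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        omega += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
        traj.append(theta)
    return traj


# For a small initial angle the period is approximately 2*pi*sqrt(length / G),
# so after one period the pendulum should be back near its starting angle.
period = 2 * math.pi * math.sqrt(1.0 / G)
steps = round(period / 0.001)
traj = simulate(length=1.0, theta0=0.05, steps=steps)

# Different (unknown) lengths produce visibly different trajectories.
short = simulate(length=0.5, theta0=0.05, steps=500)
long_ = simulate(length=2.0, theta0=0.05, steps=500)
```

Because the trajectory depends on the length, an observed trajectory carries information about that parameter, which is what makes inferring it from data plausible in the first place.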
Citations: 25
Disease state prediction from single-cell data using graph attention networks
Pub Date : 2020-02-14 DOI: 10.1145/3368555.3384449
N. Ravindra, Arijit Sehanobish, Jenna L. Pappalardo, D. Hafler, D. V. Dijk
Single-cell RNA sequencing (scRNA-seq) has revolutionized biological discovery, providing an unbiased picture of cellular heterogeneity in tissues. While scRNA-seq has been used extensively to provide insight into health and disease, it has not been used for disease prediction or diagnostics. Graph Attention Networks have proven to be versatile for a wide range of tasks by learning from both original features and graph structures. Here we present a graph attention model for predicting disease state from single-cell data on a large dataset of Multiple Sclerosis (MS) patients. MS is a disease of the central nervous system that is difficult to diagnose. We train our model on single-cell data obtained from blood and cerebrospinal fluid (CSF) for a cohort of seven MS patients and six healthy adults (HA), resulting in 66,667 individual cells. We achieve 92% accuracy in predicting MS, outperforming other state-of-the-art methods such as a graph convolutional network, random forest, and multi-layer perceptron. Further, we use the learned graph attention model to get insight into the features (cell types and genes) that are important for this prediction. The graph attention model also allows us to infer a new feature space for the cells that emphasizes the difference between the two conditions. Finally, we use the attention weights to learn a new low-dimensional embedding, which we visualize with PHATE and UMAP. To the best of our knowledge, this is the first effort to use graph attention, and deep learning in general, to predict disease state from single-cell data. We envision applying this method to single-cell data for other diseases.
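The attention weights mentioned in the abstract can be illustrated with a minimal sketch of a single attention head in the commonly used graph attention formulation: each neighbor is scored by a LeakyReLU of a learned vector applied to the concatenated, linearly transformed features, and the scores are softmax-normalized. Whether the authors' model matches this exactly is an assumption, and the weights, features, and graph below are invented toy values.

```python
import math


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def attention_weights(i, neighbors, H, W, a):
    """Softmax-normalized attention of node i over the given neighbors.

    H: node feature vectors, W: shared linear transform,
    a: attention vector applied to the concatenation [W h_i, W h_j].
    """
    zi = matvec(W, H[i])
    scores = []
    for j in neighbors:
        zj = matvec(W, H[j])
        scores.append(leaky_relu(sum(ak * zk for ak, zk in zip(a, zi + zj))))
    # Subtract the maximum score before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy graph: three cells with two features each; node 0 attends to itself and both others.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]   # identity transform, for readability
a = [0.5, -0.5, 1.0, 0.0]      # hypothetical attention vector
alpha = attention_weights(0, [0, 1, 2], H, W, a)
```

In this toy setting node 0 weights nodes 0 and 2 equally and node 1 less, which is the kind of per-neighbor signal that can then be inspected for interpretation.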
Citations: 27
An adversarial approach for the robust classification of pneumonia from chest radiographs
Pub Date : 2020-01-13 DOI: 10.1145/3368555.3384458
Joseph D. Janizek, G. Erion, A. DeGrave, Su-In Lee
While deep learning has shown promise in the domain of disease classification from medical images, models based on state-of-the-art convolutional neural network architectures often exhibit performance loss due to dataset shift. Models trained using data from one hospital system achieve high predictive performance when tested on data from the same hospital, but perform significantly worse when they are tested in different hospital systems. Furthermore, even within a given hospital system, deep learning models have been shown to depend on hospital- and patient-level confounders rather than meaningful pathology to make classifications. In order for these models to be safely deployed, we would like to ensure that they do not use confounding variables to make their classification, and that they will work well even when tested on images from hospitals that were not included in the training data. We attempt to address this problem in the context of pneumonia classification from chest radiographs. We propose an approach based on adversarial optimization, which allows us to learn more robust models that do not depend on confounders. Specifically, we demonstrate improved out-of-hospital generalization performance of a pneumonia classifier by training a model that is invariant to the view position of chest radiographs (anterior-posterior vs. posterior-anterior). Our approach leads to better predictive performance on external hospital data than both a standard baseline and previously proposed methods to handle confounding, and also suggests a method for identifying models that may rely on confounders.
Citations: 16
Variationally regularized graph-based representation learning for electronic health records
Pub Date : 2019-12-08 DOI: 10.1145/3450439.3451855
Weicheng Zhu, N. Razavian
Electronic Health Records (EHR) are high-dimensional data with implicit connections among thousands of medical concepts. These connections, such as the co-occurrence of diseases and lab-disease correlations, can be informative when only a subset of these variables is documented by the clinician. A feasible approach to improving the representation learning of EHR data is to associate relevant medical concepts and utilize these connections. Existing medical ontologies can be the reference for EHR structures, but they place numerous constraints on the data source. Recent progress on graph neural networks (GNN) enables end-to-end learning of topological structures for non-grid or non-sequential data. However, open problems remain: how to learn the medical graph adaptively, and how to understand the effect of the medical graph on representation learning. In this paper, we propose a variationally regularized encoder-decoder graph network that achieves greater robustness in graph structure learning by regularizing node representations. Our model outperforms the existing graph and non-graph based methods in various EHR predictive tasks based on both public data and real-world clinical data. Beyond the improvements in empirical performance, we provide an interpretation of the effect of variational regularization compared to a standard graph neural network, using singular value analysis.
Citations: 19
Population-aware hierarchical Bayesian domain adaptation via multi-component invariant learning
Pub Date : 2019-08-25 DOI: 10.1145/3368555.3384451
V. Mhasawade, N. Rehman, R. Chunara
While machine learning is rapidly being developed and deployed in health settings such as influenza prediction, there are critical challenges in using data from one environment to predict in another due to variability in features. Even within disease labels there can be differences (e.g. "fever" may mean something different when reported in a doctor's office versus in an online app). Moreover, models are often built on passive, observational data which contain different distributions of population subgroups (e.g. men or women). Thus, there are two forms of instability between environments in this observational transport problem. We first harness substantive knowledge from health research to conceptualize the underlying causal structure of this problem in a health outcome prediction task. Based on sources of stability in the model and the task, we posit that we can combine environment and population information in a novel population-aware hierarchical Bayesian domain adaptation framework that harnesses multiple invariant components through population attributes when needed. We study the conditions under which invariant learning fails, leading to reliance on the environment-specific attributes. Experimental results for an influenza prediction task on four datasets gathered from different contexts show the model can improve prediction in the case of largely unlabelled target data from a new environment and different constituent population, by harnessing both environment and population invariant information. This work represents a novel, principled way to address a critical challenge by blending domain (health) knowledge and algorithmic innovation. The proposed approach will have significant impact in many social settings where it matters who the data comes from and how it was generated.
Citations: 9
MIMIC-Extract: a data extraction, preprocessing, and representation pipeline for MIMIC-III
Pub Date : 2019-07-19 DOI: 10.1145/3368555.3384469
Shirly Wang, Matthew B. A. McDermott, Geeticka Chauhan, Michael C. Hughes, Tristan Naumann, M. Ghassemi
Machine learning for healthcare researchers face challenges to progress and reproducibility due to a lack of standardized processing frameworks for public datasets. We present MIMIC-Extract, an open source pipeline for transforming the raw electronic health record (EHR) data of critical care patients from the publicly-available MIMIC-III database into data structures that are directly usable in common time-series prediction pipelines. MIMIC-Extract addresses three challenges in making complex EHR data accessible to the broader machine learning community. First, MIMIC-Extract transforms raw vital sign and laboratory measurements into usable hourly time series, performing essential steps such as unit conversion, outlier handling, and aggregation of semantically similar features to reduce missingness and improve robustness. Second, MIMIC-Extract extracts and makes prediction of clinically-relevant targets possible, including outcomes such as mortality and length-of-stay as well as comprehensive hourly intervention signals for ventilators, vasopressors, and fluid therapies. Finally, the pipeline emphasizes reproducibility and extensibility to future research questions. We demonstrate the pipeline's effectiveness by developing several benchmark tasks for outcome and intervention forecasting and assessing the performance of competitive models.
Citations: 126
Explaining an increase in predicted risk for clinical alerts
Pub Date : 2019-07-10 DOI: 10.1145/3368555.3384460
Michaela Hardt, A. Rajkomar, Gerardo Flores, Andrew M. Dai, M. Howell, Greg S. Corrado, Claire Cui, Moritz Hardt
Much work aims to explain a model's prediction on a static input. We consider explanations in a temporal setting where a stateful dynamical model produces a sequence of risk estimates given an input at each time step. When the estimated risk increases, the goal of the explanation is to attribute the increase to a few relevant inputs from the past. While our formal setup and techniques are general, we carry out an in-depth case study in a clinical setting. The goal here is to alert a clinician when a patient's risk of deterioration rises. The clinician then has to decide whether to intervene and adjust the treatment. Given a potentially long sequence of new events since she last saw the patient, a concise explanation helps her to quickly triage the alert. We develop methods to lift static attribution techniques to the dynamical setting, where we identify and address challenges specific to dynamics. We then experimentally assess the utility of different explanations of clinical alerts through expert evaluation.
Citations: 11
Analyzing the role of model uncertainty for electronic health records
Pub Date : 2019-06-10 DOI: 10.1145/3368555.3384457
Michael W. Dusenberry, Dustin Tran, E. Choi, Jonas Kemp, Jeremy Nixon, Ghassen Jerfel, K. Heller, Andrew M. Dai
In medicine, both ethical and monetary costs of incorrect predictions can be significant, and the complexity of the problems often necessitates increasingly complex models. Recent work has shown that changing just the random seed is enough for otherwise well-tuned deep neural networks to vary in their individual predicted probabilities. In light of this, we investigate the role of model uncertainty methods in the medical domain. Using RNN ensembles and various Bayesian RNNs, we show that population-level metrics, such as AUC-PR, AUC-ROC, log-likelihood, and calibration error, do not capture model uncertainty. Meanwhile, the presence of significant variability in patient-specific predictions and optimal decisions motivates the need for capturing model uncertainty. Understanding the uncertainty for individual patients is an area with clear clinical impact, such as determining when a model decision is likely to be brittle. We further show that RNNs with only Bayesian embeddings can be a more efficient way to capture model uncertainty compared to ensembles, and we analyze how model uncertainty is impacted across individual input features and patient subgroups.
Citations: 94