
2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology: Latest Papers

An Approach for Incorporating Context in Building Probabilistic Predictive Models
J. Wu, William Hsu, A. Bui
With the increasing amount of information collected through clinical practice and scientific experimentation, a growing challenge is how to utilize available resources to construct predictive models to facilitate clinical decision making. Clinicians often have questions related to the treatment and outcome of a medical problem for individual patients; however, few tools exist that leverage the large collection of patient data and scientific knowledge to answer these questions. Without appropriate context, existing data that have been collected for a specific task may not be suitable for creating new models that answer different questions. This paper presents an approach that leverages available structured or unstructured data to build a probabilistic predictive model that assists physicians with answering clinical questions on individual patients. Various challenges related to transforming available data to an end-user application are addressed: problem decomposition, variable selection, context representation, automated extraction of information from unstructured data sources, model generation, and development of an intuitive application to query the model and present the results. We describe our efforts towards building a model that predicts the risk of vasospasm in aneurysm patients.
DOI: https://doi.org/10.1109/HISB.2012.30 | Published: 2012-09-27
Citations: 2
Azathioprine-Induced Comorbidity Network Reveals Patterns and Predictors of Nephrotoxicity and Neutrophilia
Vishal N. Patel, D. Kaelber
We sought to examine the frequencies and patterns of nephrotoxicity and neutrophilia due to azathioprine (AZA), and to develop a prototype method for using large de-identified electronic health record (EHR) data to aid in post-market drug surveillance. We leveraged a de-identified database of over 10 million patient EHRs to construct a network of comorbidities induced by administration of AZA, where comorbidities were defined by baseline-controlled laboratory values. To gauge the significance of the identified disease patterns, we calculated the relative risk of developing a comorbidity pair relative to a control cohort of patients taking one of 12 other anti-rheumatic agents. Nephrotoxicity, as gauged by elevations in creatinine, was present in 11% of patients taking AZA, and this frequency was significantly higher than in patients taking other anti-rheumatic agents (RR: 1.2, 95% CI: 1.04-1.43). Neutrophilia was highly prevalent (45%) in the population and was also unique to AZA (RR: 1.2, 95% CI: 1.17-1.28). Using a comorbidity network analysis, we hypothesized that the joint consideration of anemia (low hemoglobin) and an elevated LDH (190 IU/L) may serve as a predictor of impending renal dysfunction. Indeed, these two laboratory values provide approximately 100% sensitivity in predicting subsequent elevations in creatinine. Furthermore, the predictive power is unique to AZA: jointly considering anemia and an elevated LDH provides only 50% sensitivity in predicting creatinine elevations with other anti-rheumatic agents. Our work demonstrates that the construction of comorbidity networks from de-identified EHR data sets can provide both sufficient insight and statistical power to uncover novel patterns and predictors of disease.
DOI: https://doi.org/10.1109/HISB.2012.28 | Published: 2012-09-27
Citations: 0
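The azathioprine abstract above reports relative risks with 95% confidence intervals and sensitivities. As a rough illustration of how such figures are typically derived, here is a minimal sketch that assumes the common log-based (Katz) interval; the abstract does not say which interval method the authors used, and the counts in the usage lines are invented for illustration only.

```python
import math


def relative_risk(exposed_events, exposed_total, control_events, control_total, z=1.96):
    """Return (RR, lower, upper) using a log-based interval (an assumed method)."""
    risk_exposed = exposed_events / exposed_total
    risk_control = control_events / control_total
    rr = risk_exposed / risk_control
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / exposed_events - 1 / exposed_total
        + 1 / control_events - 1 / control_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def sensitivity(true_positives, false_negatives):
    """Fraction of actual positives that the predictor flags."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical counts, not taken from the paper.
rr, lower, upper = relative_risk(110, 1000, 92, 1000)
print(f"RR = {rr:.2f}, 95% CI: {lower:.2f}-{upper:.2f}")
```

A confidence interval whose lower bound stays above 1.0 is what makes a modest relative risk such as 1.2 count as statistically significant.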
A Study on Studies: Exploring the Metadata Associated with dbGaP Studies
Karen Truong, Mike Conway
The database of Genotypes and Phenotypes (dbGaP) was developed by the National Heart, Lung, and Blood Institute (NHLBI) to archive genome-wide association studies (GWAS) data. As of July 17, 2012, dbGaP contained 305 top-level studies. The metadata for each study (available from the dbGaP website) are organized into distinct sections, including a study description, inclusion/exclusion criteria, policies for authorized access requests, MeSH terms, PubMed identifiers, study histories, and the names of principal and co-investigators. Here we tabulate the salient characteristics of dbGaP metadata as part of the Phenotype Discoverer (PhD) project, a research project at the University of California San Diego Division of Biomedical Informatics that aims to enhance the "searchability" of the current dbGaP website through the alignment of phenotypes to a standard information model. In particular, we are interested in using the extracted metadata (PubMed identifiers, principal investigator names, associated journal names, etc.) as input to a statistical text.
DOI: https://doi.org/10.1109/HISB.2012.51 | Published: 2012-09-27
Citations: 1
Building an Ontology of Phenotypes for Existing GWAS Studies
N. Alipanah, Hyeon-eui Kim, L. Ohno-Machado
The database of Genotypes and Phenotypes (dbGaP) is archiving the results of different Genome Wide Association Studies (GWAS). dbGaP has a multitude of phenotype variables, but they are not harmonized across studies. Unfortunately, dbGaP lacks semantic relations among its variables. This prevents efficient information retrieval and accurate searches to find studies that contain common phenotypes. Our goal is to standardize dbGaP information to allow accurate, reusable and quick retrieval of information.
DOI: https://doi.org/10.1109/HISB.2012.36 | Published: 2012-09-27
Citations: 0
SPOT the Drug! An Unsupervised Pattern Matching Method to Extract Drug Names from Very Large Clinical Corpora
A. Coden, D. Gruhl, Neal Lewis, M. Tanenblatt, J. Terdiman
Although structured electronic health records are becoming more prevalent, much information about patient health is still recorded only in unstructured text. “Understanding” these texts has been a focus of natural language processing (NLP) research for many years, with some remarkable successes, yet there is more work to be done. Knowing the drugs patients take is critical not only for understanding patient health (e.g., for drug-drug interactions or drug-enzyme interactions), but also for secondary uses, such as research on treatment effectiveness. Several drug dictionaries have been curated, such as RxNorm, FDA's Orange Book, or NCI, with a focus on prescription drugs. Developing these dictionaries is a challenge, but even more challenging is keeping them up-to-date in the face of a rapidly advancing field. For example, it is critical to identify grapefruit as a “drug” for a patient who takes the prescription medicine Lipitor, due to their known adverse interaction. To discover other, new adverse drug interactions, a large number of patient histories often need to be examined, necessitating not only accurate but also fast algorithms to identify pharmacological substances. In this paper we propose a new algorithm, SPOT, which identifies drug names that can be used as new dictionary entries from a large corpus, where a “drug” is defined as a substance intended for use in the diagnosis, cure, mitigation, treatment, or prevention of disease. Measured against a manually annotated reference corpus, we present precision and recall values for SPOT. SPOT is language and syntax independent, can be run efficiently to keep dictionaries up-to-date, and can also suggest words and phrases that may be misspellings or uncatalogued synonyms of a known drug. We show how SPOT's lack of reliance on NLP tools makes it robust in analyzing clinical medical text. SPOT is a generalized bootstrapping algorithm, seeded with a known dictionary and automatically extracting the context within which each drug is mentioned. We define three features of such context: support, confidence and prevalence. Finally, we present the performance tradeoffs depending on the thresholds chosen for these features.
DOI: https://doi.org/10.1109/HISB.2012.16 | Published: 2012-09-27
Citations: 43
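The SPOT abstract above describes a dictionary-seeded bootstrapping loop that scores contexts by support, confidence and prevalence, but it does not define those features. The toy sketch below is therefore not the authors' algorithm: the (left word, right word) context window, the three feature definitions, the thresholds, and the function name `spot_sketch` are all assumptions made only to show the general shape of such a loop.

```python
from collections import defaultdict


def spot_sketch(sentences, seed_drugs, min_support=2, min_confidence=0.5):
    """One bootstrapping pass: learn contexts from seeds, then propose new terms."""
    seeds = {drug.lower() for drug in seed_drugs}
    fills = defaultdict(list)  # (left word, right word) -> tokens seen between them
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i in range(1, len(tokens) - 1):
            fills[(tokens[i - 1], tokens[i + 1])].append(tokens[i])

    total_seed_mentions = sum(
        1 for tokens in fills.values() for token in tokens if token in seeds
    )
    candidates = {}
    for context, tokens in fills.items():
        seed_hits = [token for token in tokens if token in seeds]
        # Assumed definitions (hypothetical, not from the paper):
        support = len(set(seed_hits))              # distinct seed drugs in this context
        confidence = len(seed_hits) / len(tokens)  # share of fills that are seed drugs
        prevalence = (
            len(seed_hits) / total_seed_mentions if total_seed_mentions else 0.0
        )                                          # share of all seed mentions covered
        if support >= min_support and confidence >= min_confidence:
            for token in tokens:
                if token not in seeds:
                    candidates.setdefault(token, []).append(
                        (context, support, confidence, prevalence)
                    )
    return candidates


corpus = [
    "patient takes lipitor daily",
    "patient takes aspirin daily",
    "patient takes grapefruit daily",
    "patient feels tired daily",
]
print(sorted(spot_sketch(corpus, ["Lipitor", "Aspirin"])))
```

In this toy corpus the context "takes _ daily" is learned from the two seed drugs, so "grapefruit" is proposed as a candidate while "tired" is not. Raising `min_confidence` trades recall for precision, which is the kind of threshold tradeoff the abstract mentions.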
A Robust Feature Selection Method for Novel Pre-microRNA Identification Using a Combination of Nucleotide-Structure Triplets
Petra Stepanowsky, Jihoon Kim, L. Ohno-Machado
MicroRNAs are a class of small non-coding RNAs that play an important role in post-transcriptional regulation of gene products. Identification of novel microRNA is difficult because the validated microRNA set is still small in size and diverse. Existing feature selection methods use different combinations of features related to the biogenesis of microRNAs, but performance evaluations are not comprehensive. We developed a robust feature selection method using a combination of three types of nucleotide-structure triplets, the minimum free energy of the secondary structure of precursor microRNAs and other extracted characteristics. We compared our new combination feature set and three other previously published sets using three different classifiers: logistic regression, support vector machine, and random forest. Our proposed feature set was not only robust across all classifier methods, but also had the highest classification performance, as measured by the area under the ROC curve.
DOI: https://doi.org/10.1109/HISB.2012.20 | Published: 2012-09-27
Citations: 3
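The pre-microRNA abstract above ranks feature sets by the area under the ROC curve. For readers unfamiliar with the metric, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The labels and scores are made up, and the function name `roc_auc` is only illustrative.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier outputs for four candidate hairpins (1 = real pre-microRNA).
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

This pairwise form is quadratic in the number of examples, which is fine for illustration; library implementations usually sort the scores instead.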
Knowledge-Based Biomedical Word Sense Disambiguation: An Evaluation and Application to Clinical Document Classification
Vijay Garla, C. Brandt
Motivation: Word Sense Disambiguation (WSD) methods automatically assign an unambiguous concept to an ambiguous term based on context, and are important to many text processing tasks. In this study, we developed and evaluated a knowledge-based WSD method that uses semantic similarity measures derived from the Unified Medical Language System (UMLS), and we evaluated the contribution of WSD to clinical text classification. Results: We evaluated our system on biomedical WSD datasets; our system compares favorably to other knowledge-based methods. We evaluated the contribution of our WSD system to clinical document classification on the 2007 Computational Medicine Challenge corpus. Machine learning classifiers trained on disambiguated concepts significantly outperformed those trained using all concepts. Availability: We integrated our WSD system with MetaMap and cTAKES, two popular biomedical natural language processing systems. We released all code required to reproduce our results and all tools developed as part of this study as open source, available under http://code.google.com/p/ytex.
DOI: https://doi.org/10.1109/HISB.2012.12 | Published: 2012-09-27
Citations: 44
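The word sense disambiguation abstract above relies on semantic similarity measures derived from the UMLS. The sketch below shows the general idea on a tiny hand-made hierarchy: each candidate sense is scored by its summed path-based similarity to the concepts found in the surrounding context, and the best-scoring sense wins. The hierarchy, the concept names, and the 1/(1 + path length) measure are assumptions for illustration; they are not UMLS content and not necessarily the measures the authors evaluated.

```python
# Hypothetical is-a hierarchy: child -> parent.
PARENT = {
    "clinical_finding": "entity",
    "physical_phenomenon": "entity",
    "respiratory_infection": "clinical_finding",
    "symptom": "clinical_finding",
    "common_cold": "respiratory_infection",
    "cough": "symptom",
    "fever": "symptom",
    "cold_temperature": "physical_phenomenon",
}


def ancestors(concept):
    """Return the concept followed by its chain of parents up to the root."""
    chain = [concept]
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def path_length(a, b):
    """Number of is-a edges on the shortest path between two concepts."""
    depth_from_b = {node: i for i, node in enumerate(ancestors(b))}
    for i, node in enumerate(ancestors(a)):
        if node in depth_from_b:
            return i + depth_from_b[node]
    raise ValueError("concepts share no common ancestor")


def disambiguate(candidate_senses, context_concepts):
    """Pick the sense with the highest summed similarity to the context."""
    def score(sense):
        return sum(1.0 / (1 + path_length(sense, c)) for c in context_concepts)
    return max(candidate_senses, key=score)


# "cold" in a note that also mentions cough and fever.
print(disambiguate(["cold_temperature", "common_cold"], ["cough", "fever"]))
```

Here "common_cold" sits four edges from each context concept while "cold_temperature" sits five edges away, so the disease sense is selected.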
Text Mining for Personal Health Information on Twitter
Marina Sokolova, Yasser Jafer, D. Schramm
With millions of people discussing their Personal Health Information (PHI) online, there is a need for the development of tools that can extract and analyze such information. We introduce two semantic-based methods for mining PHI. One method uses WordNet as a source of health-related knowledge; the other uses terms of personal relations. Incorporating semantics gives a significant improvement in retrieval of text with PHI (paired t-test, P = 0.0001).
DOI: https://doi.org/10.1109/HISB.2012.37 | Published: 2012-09-27
Citations: 6
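The Twitter abstract above reports significance with a paired t-test. As a reminder of what that test computes, the sketch below derives the t statistic and degrees of freedom from per-item differences between two methods evaluated on the same items. The scores are invented, and the final lookup of a p-value in the t distribution is left out because it needs a statistics library.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired samples of equal length."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    differences = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample standard deviation (n - 1)
    n = len(differences)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical retrieval scores with and without semantic features.
with_semantics = [0.71, 0.64, 0.80, 0.69, 0.75]
without_semantics = [0.62, 0.60, 0.70, 0.66, 0.68]
t, dof = paired_t_statistic(with_semantics, without_semantics)
print(f"t = {t:.2f} with {dof} degrees of freedom")
```

Pairing matters because each item acts as its own control: the test looks only at the per-item differences, which removes variation between items from the comparison.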
Designing Clinical Data Presentation in Electronic Dental Records Using Cognitive Task Analysis Methods
T. Thyvalikakath, Michael P. Dziabiak, Raymond Johnson, M. Torres-Urquidy, J. Yabes, T. Schleyer
Despite many decades of research on the effective development of clinical systems in medicine, the adoption of health information technology to improve patient care continues to be slow, especially in ambulatory settings. This applies to dentistry as well, a primary care discipline with approximately 137,000 practitioners in the United States. A critical reason for slow adoption is the poor usability of clinical systems, which makes it difficult for providers to navigate through the information and obtain an integrated view of patient data. Cognitive science methods have shown significant promise to meaningfully inform the design, development and assessment of clinical information systems. In most cases, these methods have been applied to evaluate the design of systems after they have been developed. Very few studies, on the other hand, have used cognitive engineering methods to support the design process for a system itself. It is this gap in knowledge, namely how cognitive engineering methods can be optimally applied to inform the system design process, that our research seeks to address. This project studied the cognitive processes and information management strategies used by dentists during a typical patient exam and applied the results to inform the design of an electronic dental record interface. The results of this study will contribute to designing clinical systems that improve cognitive support for clinicians during patient care. Such a system has the potential to enhance the quality and safety of patient care, as well as reduce healthcare costs.
DOI: https://doi.org/10.1109/HISB.2012.24 · Published: 2012-09-27
Citations: 3
Identifying Provider Counseling Practices Using Natural Language Processing: Gout Example
Olga V. Patterson, G. Kerr, J. Richards, C. Nunziato, D. Maron, R. Amdur, S. Duvall
National guidelines for a number of health conditions recommend that practitioners assess and reinforce patients' adherence to specific diet and lifestyle modifications. Counseling intervention has been shown to have a long-term positive effect on patient adherence, but the extent to which physicians comply is unknown. Evidence of counseling provided by practitioners is recorded only as free text in electronic medical records. To identify physicians' counseling practices, we developed a natural language processing system to detect text documentation of dietary counseling in gout patients.
DOI: https://doi.org/10.1109/HISB.2012.52 · Published: 2012-09-27
Citations: 1