
Latest publications in Summit on translational bioinformatics

Facilitating health data sharing across diverse practices and communities.
Ching-Ping Lin, Robert A Black, Jay Laplante, Gina A Keppel, Leah Tuzzio, Alfred O Berg, Ron J Whitener, Dedra S Buchwald, Laura-Mae Baldwin, Paul A Fishman, Sarah M Greene, John H Gennari, Peter Tarczy-Hornoch, Kari A Stephens

Health data sharing with and among practices is a method for engaging rural and underserved populations, often with strong histories of marginalization, in health research. The Institute of Translational Health Sciences, funded by a National Institutes of Health Clinical and Translational Science Award, is engaged in the LC Data QUEST project to build practice and community based research networks with the ability to share semantically aligned electronic health data. We visited ten practices and communities to assess the feasibility of and barriers to developing data sharing networks. We found that these sites had very different approaches and expectations for data sharing. In order to support practices and communities and foster the acceptance of data sharing in these settings, informaticists must take these diverse views into account. Based on these findings, we discuss system design implications and the need for flexibility in the development of community-based data sharing networks.

{"title":"Facilitating health data sharing across diverse practices and communities.","authors":"Ching-Ping Lin,&nbsp;Robert A Black,&nbsp;Jay Laplante,&nbsp;Gina A Keppel,&nbsp;Leah Tuzzio,&nbsp;Alfred O Berg,&nbsp;Ron J Whitener,&nbsp;Dedra S Buchwald,&nbsp;Laura-Mae Baldwin,&nbsp;Paul A Fishman,&nbsp;Sarah M Greene,&nbsp;John H Gennari,&nbsp;Peter Tarczy-Hornoch,&nbsp;Kari A Stephens","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Health data sharing with and among practices is a method for engaging rural and underserved populations, often with strong histories of marginalization, in health research. The Institute of Translational Health Sciences, funded by a National Institutes of Health Clinical and Translational Science Award, is engaged in the LC Data QUEST project to build practice and community based research networks with the ability to share semantically aligned electronic health data. We visited ten practices and communities to assess the feasibility of and barriers to developing data sharing networks. We found that these sites had very different approaches and expectations for data sharing. In order to support practices and communities and foster the acceptance of data sharing in these settings, informaticists must take these diverse views into account. Based on these findings, we discuss system design implications and the need for flexibility in the development of community-based data sharing networks.</p>","PeriodicalId":89276,"journal":{"name":"Summit on translational bioinformatics","volume":"2010 ","pages":"16-20"},"PeriodicalIF":0.0,"publicationDate":"2010-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041543/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29694349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
An R package for simulation experiments evaluating clinical trial designs.
Yuanyuan Wang, Roger Day

This paper presents an open-source application for evaluating competing clinical trial (CT) designs using simulations. The S4 system of classes and methods is utilized. Using object-oriented programming provides extensibility through careful, clear interface specification; using R, a widely used open-source statistical language, makes the application extensible by the people who design CTs: biostatisticians. Four key classes define the specifications of the population models, CT designs, outcome models and evaluation criteria. Five key methods define the interfaces for generating patient baseline characteristics, applying the stopping rule, assigning treatment, generating patient outcomes and calculating the criteria. Documentation of their connections with the user input screens, with the central simulation loop, and with each other facilitates extensibility. New subclasses and instances of existing classes that meet these interfaces can integrate immediately into the application. To illustrate the application, we evaluate the effect of patient pharmacokinetic heterogeneity on the performance of a common Phase I "3+3" design.
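The abstract describes the package only at the level of its interfaces, and no source code is reproduced here. As a rough sketch of the same object-oriented pattern, written in Python rather than R's S4 system and with entirely hypothetical class and method names, the four specification classes and the central simulation loop might relate as follows.

```python
# Hypothetical Python analogue of the architecture described above: four
# specification classes and a simulation loop that talks only to their interfaces.
# None of these names come from the actual R package.
from abc import ABC, abstractmethod


class PopulationModel(ABC):
    @abstractmethod
    def generate_baseline(self):              # patient baseline characteristics
        ...


class TrialDesign(ABC):
    @abstractmethod
    def assign_treatment(self, patient):      # e.g. dose level in a Phase I design
        ...

    @abstractmethod
    def should_stop(self, history):           # stopping rule
        ...


class OutcomeModel(ABC):
    @abstractmethod
    def generate_outcome(self, patient, treatment):
        ...


class Criterion(ABC):
    @abstractmethod
    def evaluate(self, history):              # e.g. proportion of patients overdosed
        ...


def simulate(population, design, outcome_model, criteria, max_patients=36):
    """Central loop: new components integrate by implementing the interfaces above."""
    history = []
    while len(history) < max_patients and not design.should_stop(history):
        patient = population.generate_baseline()
        treatment = design.assign_treatment(patient)
        outcome = outcome_model.generate_outcome(patient, treatment)
        history.append((patient, treatment, outcome))
    return {type(c).__name__: c.evaluate(history) for c in criteria}
```

In the published package these roles are played by S4 classes and generic functions; the point of the sketch is only that new designs or outcome models plug in by implementing the declared interfaces.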

{"title":"An R package for simulation experiments evaluating clinical trial designs.","authors":"Yuanyuan Wang,&nbsp;Roger Day","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>This paper presents an open-source application for evaluating competing clinical trial (CT) designs using simulations. The S4 system of classes and methods is utilized. Using object-oriented programming provides extensibility through careful, clear interface specification; using R, an open-source widely-used statistical language, makes the application extendible by the people who design CTs: biostatisticians. Four key classes define the specifications of the population models, CT designs, outcome models and evaluation criteria. Five key methods define the interfaces for generating patient baseline characteristics, stopping rule, assigning treatment, generating patient outcomes and calculating the criteria. Documentation of their connections with the user input screens, with the central simulation loop, and with each other faciliates the extensibility. New subclasses and instances of existing classes meeting these interfaces can integrate immediately into the application. To illustrate the application, we evaluate the effect of patient pharmacokinetic heterogeneity on the performance of a common Phase I \"3+3\" design.</p>","PeriodicalId":89276,"journal":{"name":"Summit on translational bioinformatics","volume":"2010 ","pages":"61-5"},"PeriodicalIF":0.0,"publicationDate":"2010-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041540/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29693646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Evaluating the impact of conceptual knowledge engineering on the design and usability of a clinical and translational science collaboration portal.
Philip R O Payne, Tara B Borlawsky, Robert Rice, Peter J Embi

With the growing prevalence of large-scale, team science endeavors in the biomedical and life science domains, the impetus to implement platforms capable of supporting asynchronous interaction among multidisciplinary groups of collaborators has increased commensurately. However, there is a paucity of literature describing systematic approaches to identifying the information needs of targeted end-users for such platforms, and the translation of such requirements into practicable software component design criteria. In previous studies, we have reported upon the efficacy of employing conceptual knowledge engineering (CKE) techniques to systematically address both of the preceding challenges in the context of complex biomedical applications. In this manuscript we evaluate the impact of CKE approaches on the design of a clinical and translational science collaboration portal, and report on the preliminary qualitative user satisfaction with the resulting system.

{"title":"Evaluating the impact of conceptual knowledge engineering on the design and usability of a clinical and translational science collaboration portal.","authors":"Philip R O Payne,&nbsp;Tara B Borlawsky,&nbsp;Robert Rice,&nbsp;Peter J Embi","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>With the growing prevalence of large-scale, team science endeavors in the biomedical and life science domains, the impetus to implement platforms capable of supporting asynchronous interaction among multidisciplinary groups of collaborators has increased commensurately. However, there is a paucity of literature describing systematic approaches to identifying the information needs of targeted end-users for such platforms, and the translation of such requirements into practicable software component design criteria. In previous studies, we have reported upon the efficacy of employing conceptual knowledge engineering (CKE) techniques to systematically address both of the preceding challenges in the context of complex biomedical applications. In this manuscript we evaluate the impact of CKE approaches relative to the design of a clinical and translational science collaboration portal, and report upon the preliminary qualitative users satisfaction as reported for the resulting system.</p>","PeriodicalId":89276,"journal":{"name":"Summit on translational bioinformatics","volume":"2010 ","pages":"41-5"},"PeriodicalIF":0.0,"publicationDate":"2010-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041529/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29693717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A knowledge extraction framework for biomedical pathways.
Sanda Harabagiu, Cosmin Adrian Bejan

In this paper we present a novel knowledge extraction framework that is based on semantic parsing. The semantic information originates in a variety of resources, but one in particular, namely BioFrameNet, is central to the characterization of complex events and processes that form biomedical pathways. The paper discusses the promising results of semantic parsing and explains how these results can be used for capturing complex medical knowledge.

{"title":"A knowledge extraction framework for biomedical pathways.","authors":"Sanda Harabagiu,&nbsp;Cosmin Adrian Bejan","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>In this paper we present a novel knowledge extraction framework that is based on semantic parsing. The semantic information originates in a variety of resources, but one in particular, namely BioFrameNet, is central to the characterization of complex events and processes that form biomedical pathways. The paper discusses the promising results of semantic parsing and explains how these results can be used for capturing complex medical knowledge.</p>","PeriodicalId":89276,"journal":{"name":"Summit on translational bioinformatics","volume":"2010 ","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2010-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041549/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29694343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
The human studies database project: federating human studies design data using the ontology of clinical research.
Ida Sim, Simona Carini, Samson Tu, Rob Wynden, Brad H Pollock, Shamim A Mollah, Davera Gabriel, Herbert K Hagler, Richard H Scheuermann, Harold P Lehmann, Knut M Wittkowski, Meredith Nahm, Suzanne Bakken

Human studies, encompassing interventional and observational studies, are the most important source of evidence for advancing our understanding of health, disease, and treatment options. To promote discovery, the design and results of these studies should be made machine-readable for large-scale data mining, synthesis, and re-analysis. The Human Studies Database Project aims to define and implement an informatics infrastructure for institutions to share the design of their human studies. We have developed the Ontology of Clinical Research (OCRe) to model study features such as design type, interventions, and outcomes to support scientific query and analysis. We are using OCRe as the reference semantics for federated data sharing of human studies over caGrid, and are piloting this implementation with several Clinical and Translational Science Award (CTSA) institutions.
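OCRe itself is not reproduced in the abstract, so the following is only an illustrative data-structure sketch, in Python, of the kinds of study-design features it is said to model (design type, interventions, outcomes); the field names and the toy query are assumptions, not the actual ontology or the caGrid interface.

```python
# Illustrative only: a minimal, hypothetical record of the study-design features the
# abstract says OCRe models. The field names are not taken from OCRe, and a real
# federated query over caGrid would not be a simple list comprehension.
from dataclasses import dataclass, field
from typing import List


@dataclass
class HumanStudyDesign:
    study_id: str
    design_type: str                  # e.g. "randomized controlled trial", "cohort"
    interventions: List[str] = field(default_factory=list)
    outcomes: List[str] = field(default_factory=list)


def studies_with_design(studies: List[HumanStudyDesign], design_type: str):
    """Toy 'scientific query': select shared study designs by design type."""
    return [s for s in studies if s.design_type == design_type]


shared = [
    HumanStudyDesign("study-001", "randomized controlled trial",
                     ["drug A", "placebo"], ["overall survival"]),
    HumanStudyDesign("study-002", "cohort", [], ["incidence of type 2 diabetes"]),
]
print([s.study_id for s in studies_with_design(shared, "cohort")])   # ['study-002']
```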

{"title":"The human studies database project: federating human studies design data using the ontology of clinical research.","authors":"Ida Sim,&nbsp;Simona Carini,&nbsp;Samson Tu,&nbsp;Rob Wynden,&nbsp;Brad H Pollock,&nbsp;Shamim A Mollah,&nbsp;Davera Gabriel,&nbsp;Herbert K Hagler,&nbsp;Richard H Scheuermann,&nbsp;Harold P Lehmann,&nbsp;Knut M Wittkowski,&nbsp;Meredith Nahm,&nbsp;Suzanne Bakken","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Human studies, encompassing interventional and observational studies, are the most important source of evidence for advancing our understanding of health, disease, and treatment options. To promote discovery, the design and results of these studies should be made machine-readable for large-scale data mining, synthesis, and re-analysis. The Human Studies Database Project aims to define and implement an informatics infrastructure for institutions to share the design of their human studies. We have developed the Ontology of Clinical Research (OCRe) to model study features such as design type, interventions, and outcomes to support scientific query and analysis. We are using OCRe as the reference semantics for federated data sharing of human studies over caGrid, and are piloting this implementation with several Clinical and Translational Science Award (CTSA) institutions.</p>","PeriodicalId":89276,"journal":{"name":"Summit on translational bioinformatics","volume":"2010 ","pages":"51-5"},"PeriodicalIF":0.0,"publicationDate":"2010-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041546/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29693644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
An automated approach to calculating the daily dose of tacrolimus in electronic health records.
Hua Xu, Son Doan, Kelly A Birdwell, James D Cowan, Andrew J Vincz, David W Haas, Melissa A Basford, Joshua C Denny

Clinical research often requires extracting detailed drug information, such as medication names and dosages, from Electronic Health Records (EHR). Since medication information is often recorded in both structured and unstructured formats in the EHR, extracting all the relevant drug mentions and determining the daily dose of a medication for a selected patient at a given date can be a challenging and time-consuming task. In this paper, we present an automated approach using natural language processing to calculate daily doses of medications mentioned in clinical text, using tacrolimus as a test case. We evaluated this method using data sets from four different types of unstructured clinical data. Our results showed that the system achieved precisions of 0.90-1.00 and recalls of 0.81-1.00.
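The published system's rules are not given in the abstract; the sketch below only illustrates the kind of computation involved, i.e. summing dose strength times administration frequency over a day's tacrolimus mentions. The regular expression and frequency table are simplified assumptions, not the authors' NLP pipeline.

```python
# Simplified sketch only: compute a daily dose by summing dose strength times
# administration frequency over tacrolimus mentions in one day's text. The pattern
# and frequency map are illustrative assumptions, not the published NLP system.
import re

FREQ_PER_DAY = {"qd": 1, "daily": 1, "bid": 2, "twice daily": 2, "tid": 3, "qid": 4}

MENTION = re.compile(
    r"tacrolimus\s+(?P<dose>\d+(?:\.\d+)?)\s*mg\s+(?P<freq>qd|daily|bid|twice daily|tid|qid)",
    re.IGNORECASE,
)


def daily_dose_mg(text: str) -> float:
    """Sum dose * frequency over all matched tacrolimus mentions in the text."""
    total = 0.0
    for m in MENTION.finditer(text):
        total += float(m.group("dose")) * FREQ_PER_DAY[m.group("freq").lower()]
    return total


print(daily_dose_mg("Continue tacrolimus 2 mg bid, recheck trough in one week."))   # 4.0
```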

{"title":"An automated approach to calculating the daily dose of tacrolimus in electronic health records.","authors":"Hua Xu,&nbsp;Son Doan,&nbsp;Kelly A Birdwell,&nbsp;James D Cowan,&nbsp;Andrew J Vincz,&nbsp;David W Haas,&nbsp;Melissa A Basford,&nbsp;Joshua C Denny","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Clinical research often requires extracting detailed drug information, such as medication names and dosages, from Electronic Health Records (EHR). Since medication information is often recorded as both structured and unstructured formats in the EHR, extracting all the relevant drug mentions and determining the daily dose of a medication for a selected patient at a given date can be a challenging and time-consuming task. In this paper, we present an automated approach using natural language processing to calculate daily doses of medications mentioned in clinical text, using tacrolimus as a test case. We evaluated this method using data sets from four different types of unstructured clinical data. Our results showed that the system achieved precisions of 0.90-1.00 and recalls of 0.81-1.00.</p>","PeriodicalId":89276,"journal":{"name":"Summit on translational bioinformatics","volume":"2010 ","pages":"71-5"},"PeriodicalIF":0.0,"publicationDate":"2010-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041548/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29693648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
PositionMatcher: A Fast Custom-Annotation Tool for Short DNA Sequences.
Erik Pitzer, Jihoon Kim, Kiltesh Patel, Pedro A Galante, Lucila Ohno-Machado

Microarray probes and reads from massively parallel sequencing technologies are the two most widely used genomic tags for transcriptome studies. Names and underlying technologies might differ, but expression technologies share a common objective: to obtain mRNA abundance values at the gene level with high sensitivity and specificity. However, the initial tag annotation becomes obsolete as more insight is gained into biological references (genome, transcriptome, SNP, etc.). While novel alignment algorithms for short reads are being released every month, solutions for rapid annotation of tags are rare. We have developed a generic matching algorithm that uses genomic positions for rapid custom-annotation of tags with a time complexity of O(n log n). We demonstrate our algorithm on the custom annotation of Illumina massively parallel sequencing reads and Affymetrix microarray probes, and on the identification of alternatively spliced regions.
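The abstract states the idea (match tags to annotations by genomic position in roughly O(n log n)) without giving the algorithm, so the following Python sketch shows one conventional sort-and-sweep way to do such matching, under the simplifying assumptions that tags are single positions and all coordinates lie on one chromosome; it is not the published PositionMatcher code.

```python
# Sketch of sort-and-sweep position matching: assign point tags (e.g. read start
# positions) to annotation intervals (e.g. exons), sorting both lists first. This
# illustrates the general technique, not the published PositionMatcher implementation.
import heapq


def match_positions(tags, intervals):
    """tags: [(pos, tag_id)]; intervals: [(start, end, name)], all on one chromosome."""
    tags = sorted(tags)
    intervals = sorted(intervals)
    matches = []
    active = []                  # min-heap of (end, name) still overlapping the sweep
    i = 0
    for pos, tag_id in tags:
        while i < len(intervals) and intervals[i][0] <= pos:     # open intervals reached
            start, end, name = intervals[i]
            heapq.heappush(active, (end, name))
            i += 1
        while active and active[0][0] < pos:                     # close intervals passed
            heapq.heappop(active)
        matches.extend((tag_id, name) for _, name in active)
    return matches


print(match_positions([(15, "read1"), (45, "read2")],
                      [(10, 20, "exon1"), (18, 50, "exon2")]))
# [('read1', 'exon1'), ('read2', 'exon2')]
```

The sorting and heap operations keep the matching itself near O(n log n) in the number of tags and intervals, plus the size of the output.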

{"title":"PositionMatcher: A Fast Custom-Annotation Tool for Short DNA Sequences.","authors":"Erik Pitzer,&nbsp;Jihoon Kim,&nbsp;Kiltesh Patel,&nbsp;Pedro A Galante,&nbsp;Lucila Ohno-Machado","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Microarray probes and reads from massively parallel sequencing technologies are two most widely used genomic tags for a transcriptome study. Names and underlying technologies might differ, but expression technologies share a common objective-to obtain mRNA abundance values at the gene level, with high sensitivity and specificity. However, the initial tag annotation becomes obsolete as more insight is gained into biological references (genome, transcriptome, SNP, etc.). While novel alignment algorithms for short reads are being released every month, solutions for rapid annotation of tags are rare. We have developed a generic matching algorithm that uses genomic positions for rapid custom-annotation of tags with a time complexity O(nlogn). We demonstrate our algorithm on the custom annotation of Illumina massively parallel sequencing reads and Affymetrix microarray probes and identification of alternatively spliced regions.</p>","PeriodicalId":89276,"journal":{"name":"Summit on translational bioinformatics","volume":"2010 ","pages":"25-9"},"PeriodicalIF":0.0,"publicationDate":"2010-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041550/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29693712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Automated ontological gene annotation for computing disease similarity.
Sachin Mathur, Deendayal Dinakarpandian

The annotation of genes and gene products with information on associated diseases is useful as an aid to clinical diagnosis and drug discovery. Several supervised and unsupervised methods exist that automate the association of genes with diseases, but relatively little work has been done to map protein sequence data to disease terminologies. This paper augments an existing open disease terminology, the Disease Ontology (DO), and uses it for automated annotation of Swissprot records. In addition to the inherent benefits of mapping data to a rich ontology, we demonstrate a gain of 36.1% in gene-disease associations compared to those in DO. Further, we measure disease similarity by exploiting the co-occurrence of annotations among proteins and the hierarchical structure of DO. This makes it possible to find related diseases or signs, with the potential to uncover previously unknown relationships.
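The abstract describes the similarity measure only in words; as a loose stand-in for the co-occurrence component (ignoring the DO hierarchy the authors also exploit), one simple formulation is a Jaccard index over the protein sets annotated with each disease, sketched below with invented placeholder accessions.

```python
# Loose stand-in for the co-occurrence part of the similarity measure: each disease
# maps to the set of proteins annotated with it, and similarity is the Jaccard index
# of those sets. The accessions below are invented examples, and the DO hierarchy
# (which the paper also uses) is ignored here.
from typing import Dict, Set


def jaccard_similarity(annotations: Dict[str, Set[str]], d1: str, d2: str) -> float:
    a, b = annotations[d1], annotations[d2]
    return len(a & b) / len(a | b) if (a or b) else 0.0


disease_to_proteins = {
    "type 2 diabetes": {"PROT1", "PROT2", "PROT3"},
    "obesity":         {"PROT1", "PROT3", "PROT4"},
    "asthma":          {"PROT5"},
}
print(jaccard_similarity(disease_to_proteins, "type 2 diabetes", "obesity"))   # 0.5
print(jaccard_similarity(disease_to_proteins, "type 2 diabetes", "asthma"))    # 0.0
```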

{"title":"Automated ontological gene annotation for computing disease similarity.","authors":"Sachin Mathur,&nbsp;Deendayal Dinakarpandian","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The annotation of gene/gene products with information on associated diseases is useful as an aid to clinical diagnosis and drug discovery. Several supervised and unsupervised methods exist that automate the association of genes with diseases, but relatively little work has been done to map protein sequence data to disease terminologies. This paper augments an existing open-disease terminology, the Disease Ontology (DO), and uses it for automated annotation of Swissprot records. In addition to the inherent benefits of mapping data to a rich ontology, we demonstrate a gain of 36.1% in gene-disease associations compared to that in DO. Further, we measure disease similarity by exploiting the co-occurrence of annotation among proteins and the hierarchical structure of DO. This makes it possible to find related diseases or signs, with the potential to find previously unknown relationships.</p>","PeriodicalId":89276,"journal":{"name":"Summit on translational bioinformatics","volume":"2010 ","pages":"12-6"},"PeriodicalIF":0.0,"publicationDate":"2010-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041538/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29694348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Distributed cognition artifacts on clinical research data collection forms.
Meredith Nahm, Vickie D Nguyen, Elie Razzouk, Min Zhu, Jiajie Zhang

Medical record abstraction, a primary mode of data collection in secondary data use, is associated with high error rates. Cognitive factors have not been studied as a possible explanation for medical record abstraction errors. We employed the theory of distributed representation and representational analysis to systematically evaluate cognitive demands in medical record abstraction and the extent of external cognitive support employed in a sample of clinical research data collection forms. We show that the cognitive load required for abstraction in 61% of the sampled data elements was high, exceedingly so in 9%. Further, the data collection forms did not support external cognition for the most complex data elements. High working memory demands are a possible explanation for the association of data errors with data elements requiring abstractor interpretation, comparison, mapping or calculation. The representational analysis used here can be used to identify data elements with high cognitive demands.

{"title":"Distributed cognition artifacts on clinical research data collection forms.","authors":"Meredith Nahm, Vickie D Nguyen, Elie Razzouk, Min Zhu, Jiajie Zhang","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Medical record abstraction, a primary mode of data collection in secondary data use, is associated with high error rates. Cognitive factors have not been studied as a possible explanation for medical record abstraction errors. We employed the theory of distributed representation and representational analysis to systematically evaluate cognitive demands in medical record abstraction and the extent of external cognitive support employed in a sample of clinical research data collection forms.We show that the cognitive load required for abstraction in 61% of the sampled data elements was high, exceedingly so in 9%. Further, the data collection forms did not support external cognition for the most complex data elements. High working memory demands are a possible explanation for the association of data errors with data elements requiring abstractor interpretation, comparison, mapping or calculation. The representational analysis used here can be used to identify data elements with high cognitive demands.</p>","PeriodicalId":89276,"journal":{"name":"Summit on translational bioinformatics","volume":"2010 ","pages":"36-40"},"PeriodicalIF":0.0,"publicationDate":"2010-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041537/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29693716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Corpus-based Approach to Creating a Semantic Lexicon for Clinical Research Eligibility Criteria from UMLS.
Zhihui Luo, Robert Duffy, Stephen Johnson, Chunhua Weng

We describe a corpus-based approach to creating a semantic lexicon using UMLS knowledge sources. We extracted 10,000 sentences from the eligibility criteria sections of clinical trial summaries contained in ClinicalTrials.gov. The UMLS Metathesaurus and SPECIALIST Lexical Tools were used to extract and normalize UMLS-recognizable terms. When annotated with Semantic Network types, the corpus had a lexical ambiguity of 1.57 (= total types for unique lexemes / total unique lexemes) and a word occurrence ambiguity of 1.96 (= total type occurrences / total word occurrences). A set of semantic preference rules was developed and applied to completely eliminate ambiguity in semantic type assignment. The lexicon covered 95.95% of the UMLS-recognizable terms in our corpus. A total of 20 UMLS semantic types, representing about 17% of all the distinct semantic types assigned to corpus lexemes, covered about 80% of the vocabulary of our corpus.
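The two ambiguity figures follow directly from the formulas in parentheses; the short sketch below recomputes them on an invented toy corpus, purely to make the arithmetic concrete.

```python
# Toy recomputation of the two ambiguity measures defined above. The annotated
# occurrences are invented purely to show the arithmetic, not real corpus data.
from collections import defaultdict

# (lexeme, semantic types assigned at this occurrence)
occurrences = [
    ("discharge", {"Finding", "Health Care Activity"}),
    ("discharge", {"Finding", "Health Care Activity"}),
    ("insulin",   {"Pharmacologic Substance"}),
    ("pregnancy", {"Organism Function"}),
]

types_per_lexeme = defaultdict(set)
for lexeme, types in occurrences:
    types_per_lexeme[lexeme] |= types

# lexical ambiguity = total types over unique lexemes / number of unique lexemes
lexical_ambiguity = sum(len(t) for t in types_per_lexeme.values()) / len(types_per_lexeme)
# occurrence ambiguity = total type occurrences / total word occurrences
occurrence_ambiguity = sum(len(t) for _, t in occurrences) / len(occurrences)

print(lexical_ambiguity)      # (2 + 1 + 1) / 3 ≈ 1.33
print(occurrence_ambiguity)   # (2 + 2 + 1 + 1) / 4 = 1.5
```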

{"title":"Corpus-based Approach to Creating a Semantic Lexicon for Clinical Research Eligibility Criteria from UMLS.","authors":"Zhihui Luo,&nbsp;Robert Duffy,&nbsp;Stephen Johnson,&nbsp;Chunhua Weng","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>We describe a corpus-based approach to creating a semantic lexicon using UMLS knowledge sources. We extracted 10,000 sentences from the eligibility criteria sections of clinical trial summaries contained in ClinicalTrials.gov. The UMLS Metathesaurus and SPECIALIST Lexical Tools were used to extract and normalize UMLS recognizable terms. When annotated with Semantic Network types, the corpus had a lexical ambiguity of 1.57 (=total types for unique lexemes / total unique lexemes) and a word occurrence ambiguity of 1.96 (=total type occurrences / total word occurrences). A set of semantic preference rules was developed and applied to completely eliminate ambiguity in semantic type assignment. The lexicon covered 95.95% UMLS-recognizable terms in our corpus. A total of 20 UMLS semantic types, representing about 17% of all the distinct semantic types assigned to corpus lexemes, covered about 80% of the vocabulary of our corpus.</p>","PeriodicalId":89276,"journal":{"name":"Summit on translational bioinformatics","volume":"2010 ","pages":"26-30"},"PeriodicalIF":0.0,"publicationDate":"2010-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041551/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29693713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0