Background: PubMed is designed to provide rapid, comprehensive retrieval of papers that discuss a given topic. However, because PubMed does not organize the search output further, it is difficult for users to gain an overview of the retrieved literature along non-topical dimensions, to drill down to find individual articles relevant to a particular individual's needs, or to browse the collection.
Results: In this paper, we present Anne O'Tate, a web-based tool that processes articles retrieved from PubMed and displays multiple aspects of the articles to the user, according to pre-defined categories such as the "most important" words found in titles or abstracts; topics; journals; authors; publication years; and affiliations. Clicking on a given item opens a new window that displays all papers that contain that item. One can navigate by drilling down through the categories progressively, e.g., one can first restrict the articles according to author name and then restrict that subset by affiliation. Alternatively, one can expand small sets of articles to display the most closely related articles. We also implemented a novel cluster-by-topic method that generates a concise set of topics covering most of the retrieved articles.
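To make the faceted drill-down concrete, here is a minimal sketch of summarizing a result set by pre-defined categories and then progressively restricting it. This is not the authors' implementation: the records, field names, and sample values below are illustrative assumptions, and Anne O'Tate's own data model and "most important words" ranking are not reproduced.

```python
from collections import Counter

# Hypothetical, simplified article records standing in for a PubMed result set.
articles = [
    {"title": "Gene regulation in yeast", "authors": ["Smith J", "Lee K"],
     "journal": "J Mol Biol", "year": 2006, "affiliation": "Univ A"},
    {"title": "Yeast promoter motifs", "authors": ["Smith J"],
     "journal": "Nucleic Acids Res", "year": 2007, "affiliation": "Univ B"},
    {"title": "Chromatin and transcription", "authors": ["Lee K"],
     "journal": "J Mol Biol", "year": 2007, "affiliation": "Univ A"},
]

def summarize(records, facet):
    """Count how often each value of a facet (journal, year, ...) occurs."""
    values = []
    for rec in records:
        v = rec[facet]
        values.extend(v if isinstance(v, list) else [v])
    return Counter(values)

def drill_down(records, facet, value):
    """Restrict the result set to records matching one facet value."""
    def matches(rec):
        v = rec[facet]
        return value in v if isinstance(v, list) else v == value
    return [rec for rec in records if matches(rec)]

# Summarize by author, then progressively restrict: author -> affiliation.
print(summarize(articles, "authors"))      # Counter({'Smith J': 2, 'Lee K': 2})
by_smith = drill_down(articles, "authors", "Smith J")
print(summarize(by_smith, "affiliation"))  # Counter({'Univ A': 1, 'Univ B': 1})
```

Each drill-down step simply re-runs the same summarization on the restricted subset, which is what lets categories be applied in any order.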
Conclusion: Anne O'Tate is an integrated, generic tool for summarization, drill-down and browsing of PubMed search results that accommodates a wide range of biomedical users and needs. It can be accessed through the Arrowsmith project website (http://arrowsmith.psych.uic.edu). Peer review and editorial matters for this article were handled by Aaron Cohen.
{"title":"Anne O'Tate: A tool to support user-driven summarization, drill-down and browsing of PubMed search results.","authors":"Neil R Smalheiser, Wei Zhou, Vetle I Torvik","doi":"10.1186/1747-5333-3-2","DOIUrl":"https://doi.org/10.1186/1747-5333-3-2","url":null,"abstract":"<p><strong>Background: </strong>PubMed is designed to provide rapid, comprehensive retrieval of papers that discuss a given topic. However, because PubMed does not organize the search output further, it is difficult for users to grasp an overview of the retrieved literature according to non-topical dimensions, to drill-down to find individual articles relevant to a particular individual's need, or to browse the collection.</p><p><strong>Results: </strong>In this paper, we present Anne O'Tate, a web-based tool that processes articles retrieved from PubMed and displays multiple aspects of the articles to the user, according to pre-defined categories such as the \"most important\" words found in titles or abstracts; topics; journals; authors; publication years; and affiliations. Clicking on a given item opens a new window that displays all papers that contain that item. One can navigate by drilling down through the categories progressively, e.g., one can first restrict the articles according to author name and then restrict that subset by affiliation. Alternatively, one can expand small sets of articles to display the most closely related articles. We also implemented a novel cluster-by-topic method that generates a concise set of topics covering most of the retrieved articles.</p><p><strong>Conclusion: </strong>Anne O'Tate is an integrated, generic tool for summarization, drill-down and browsing of PubMed search results that accommodates a wide range of biomedical users and needs. It can be accessed at 4. Peer review and editorial matters for this article were handled by Aaron Cohen.</p>","PeriodicalId":87404,"journal":{"name":"Journal of biomedical discovery and collaboration","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2008-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1747-5333-3-2","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"27268810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
William A Baumgartner, K Bretonnel Cohen, Lawrence Hunter
Background: Improved evaluation methodologies have been identified as a necessary prerequisite to the improvement of text mining theory and practice. This paper presents a publicly available framework that facilitates thorough, structured, and large-scale evaluations of text mining technologies. The extensibility of this framework, its ability to uncover system-wide characteristics by analyzing component parts, and its usefulness for facilitating third-party application integration are demonstrated through examples in the biomedical domain.
Results: Our evaluation framework was assembled using the Unstructured Information Management Architecture. It was used to analyze a set of gene mention identification systems involving 225 combinations of system, evaluation corpus, and correctness measure. Interactions between all three were found to affect the relative rankings of the systems. A second experiment evaluated gene normalization system performance using as input 4,097 combinations of gene mention systems and gene mention system-combining strategies. Gene mention system recall is shown to affect gene normalization system performance much more than does gene mention system precision, and high gene normalization performance is shown to be achievable with remarkably low levels of gene mention system precision.
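As an illustration of why system, corpus, and correctness measure interact, the following sketch ranks two hypothetical gene mention systems under every corpus/measure combination. The data are toy values invented here; the actual framework is built on UIMA and is far more extensive.

```python
from itertools import product

# Toy gold standard and system outputs: sets of (doc_id, start, end) mention spans.
gold = {("d1", 0, 4), ("d1", 10, 15), ("d2", 3, 8)}
systems = {
    "sysA": {("d1", 0, 4), ("d1", 10, 15), ("d2", 3, 8),
             ("d2", 20, 25), ("d1", 30, 35)},  # high recall, lower precision
    "sysB": {("d1", 0, 4)},                    # high precision, low recall
}

def precision_recall_f1(predicted, reference):
    """Exact-span matching; other correctness measures would relax this."""
    tp = len(predicted & reference)
    p = tp / len(predicted) if predicted else 0.0
    r = tp / len(reference) if reference else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

corpora = {"corpus1": gold}
measures = {"P": 0, "R": 1, "F1": 2}

# Rank the systems under every corpus/measure combination; note how the
# ranking flips between precision and recall for these two toy systems.
for corpus_name, measure_name in product(corpora, measures):
    scores = {name: precision_recall_f1(out, corpora[corpus_name])[measures[measure_name]]
              for name, out in systems.items()}
    print(corpus_name, measure_name, sorted(scores, key=scores.get, reverse=True), scores)
```

In miniature, this mirrors the paper's finding: which measure (and corpus) is chosen determines which system looks best, and a high-recall, low-precision mention system can still be the more useful input downstream.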
Conclusion: The software presented in this paper demonstrates the potential for novel discovery resulting from the structured evaluation of biomedical language processing systems, as well as the usefulness of such an evaluation framework for promoting collaboration between developers of biomedical language processing technologies. The code base is available as part of the BioNLP UIMA Component Repository on SourceForge.net.
{"title":"An open-source framework for large-scale, flexible evaluation of biomedical text mining systems.","authors":"William A Baumgartner, K Bretonnel Cohen, Lawrence Hunter","doi":"10.1186/1747-5333-3-1","DOIUrl":"https://doi.org/10.1186/1747-5333-3-1","url":null,"abstract":"<p><strong>Background: </strong>Improved evaluation methodologies have been identified as a necessary prerequisite to the improvement of text mining theory and practice. This paper presents a publicly available framework that facilitates thorough, structured, and large-scale evaluations of text mining technologies. The extensibility of this framework and its ability to uncover system-wide characteristics by analyzing component parts as well as its usefulness for facilitating third-party application integration are demonstrated through examples in the biomedical domain.</p><p><strong>Results: </strong>Our evaluation framework was assembled using the Unstructured Information Management Architecture. It was used to analyze a set of gene mention identification systems involving 225 combinations of system, evaluation corpus, and correctness measure. Interactions between all three were found to affect the relative rankings of the systems. A second experiment evaluated gene normalization system performance using as input 4,097 combinations of gene mention systems and gene mention system-combining strategies. Gene mention system recall is shown to affect gene normalization system performance much more than does gene mention system precision, and high gene normalization performance is shown to be achievable with remarkably low levels of gene mention system precision.</p><p><strong>Conclusion: </strong>The software presented in this paper demonstrates the potential for novel discovery resulting from the structured evaluation of biomedical language processing systems, as well as the usefulness of such an evaluation framework for promoting collaboration between developers of biomedical language processing technologies. The code base is available as part of the BioNLP UIMA Component Repository on SourceForge.net.</p>","PeriodicalId":87404,"journal":{"name":"Journal of biomedical discovery and collaboration","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2008-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1747-5333-3-1","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"27225842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background: Biological organisms and their components are better conceived within categories based on similarity rather than on identity. Biologists routinely operate with similarity-based concepts such as "model organism" and "motif." There has been little exploration of the characteristics of the similarity-based categories that exist in biology. This study uses the case of the discovery and classification of zinc finger proteins to explore how biological categories based on similarity are represented.
Results: The existence of a category of "zinc finger proteins" rested on (1) a lumpy gradient of similarity, (2) a link between function and structure, (3) the establishment of a range of appearances across systems and organisms, and (4) an evolutionary locus as a historically based common ground.
Conclusion: More systematic application of the idea of similarity-based categorization might eliminate the assumption that biological characteristics can only contribute to narrow categorization of humans. It also raises possibilities for refining data-driven exploration efforts.
{"title":"Generalization through similarity: motif discourse in the discovery and elaboration of zinc finger proteins.","authors":"Celeste Michelle Condit, L Bruce Railsback","doi":"10.1186/1747-5333-2-5","DOIUrl":"https://doi.org/10.1186/1747-5333-2-5","url":null,"abstract":"<p><strong>Background: </strong>Biological organisms and their components are better conceived within categories based on similarity rather than on identity. Biologists routinely operate with similarity-based concepts such as \"model organism\" and \"motif.\" There has been little exploration of the characteristics of the similarity-based categories that exist in biology. This study uses the case of the discovery and classification of zinc finger proteins to explore how biological categories based in similarity are represented.</p><p><strong>Results: </strong>The existence of a category of \"zinc finger proteins\" was based in 1) a lumpy gradient of similarity, 2) a link between function and structure, 3) establishment of a range of appearance across systems and organisms, and 4) an evolutionary locus as a historically based common-ground.</p><p><strong>Conclusion: </strong>More systematic application of the idea of similarity-based categorization might eliminate the assumption that biological characteristics can only contribute to narrow categorization of humans. It also raises possibilities for refining data-driven exploration efforts.</p>","PeriodicalId":87404,"journal":{"name":"Journal of biomedical discovery and collaboration","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2007-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1747-5333-2-5","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"27029359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Helen L Johnson, William A Baumgartner, Martin Krallinger, K Bretonnel Cohen, Lawrence Hunter
Background: Most biomedical corpora have not been used outside of the lab that created them, despite the fact that the availability of the gold-standard evaluation data they provide is one of the rate-limiting factors for the progress of biomedical text mining. Data suggest that one major factor affecting the use of a corpus outside of its home laboratory is the format in which it is distributed. This paper tests the hypothesis that corpus refactoring - changing the format of a corpus without altering its semantics - is a feasible goal, namely that it can be accomplished with a semi-automatable process and in a time-efficient way. We used simple text processing methods and limited human validation to convert the Protein Design Group corpus into two new formats: WordFreak and embedded XML. We tracked the total time expended and the success rates of the automated steps.
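A minimal sketch of the embedded-XML half of such a conversion, under the assumption of a simple standoff input (sentence text plus character offsets of mentions). The record schema here is invented for illustration; the real Protein Design Group corpus format and the WordFreak conversion are more involved, and in the paper's spirit a human validator would spot-check the automatic output.

```python
from xml.sax.saxutils import escape

# Toy record in an assumed standoff format: sentence text plus
# (start, end, type) character offsets. Not the corpus's real schema.
record = {
    "text": "RecA binds single-stranded DNA.",
    "mentions": [(0, 4, "protein")],
}

def to_embedded_xml(rec):
    """Wrap each annotated span in an element, escaping the surrounding text."""
    text, parts, last = rec["text"], [], 0
    for start, end, tag in sorted(rec["mentions"]):
        parts.append(escape(text[last:start]))
        parts.append("<%s>%s</%s>" % (tag, escape(text[start:end]), tag))
        last = end
    parts.append(escape(text[last:]))
    return "<sentence>%s</sentence>" % "".join(parts)

print(to_embedded_xml(record))
# -> <sentence><protein>RecA</protein> binds single-stranded DNA.</sentence>
```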
Results: The refactored corpus is available for download at the BioNLP SourceForge website http://bionlp.sourceforge.net. The total time expended was just over three person-weeks, consisting of about 102 hours of programming time (much of which is one-time development cost) and 20 hours of manual validation of automatic outputs. Additionally, the steps required to refactor any corpus are presented.
Conclusion: We conclude that refactoring of publicly available corpora is a technically and economically feasible method for increasing the usage of data already available for evaluating biomedical language processing systems.
{"title":"Corpus refactoring: a feasibility study.","authors":"Helen L Johnson, William A Baumgartner, Martin Krallinger, K Bretonnel Cohen, Lawrence Hunter","doi":"10.1186/1747-5333-2-4","DOIUrl":"https://doi.org/10.1186/1747-5333-2-4","url":null,"abstract":"<p><strong>Background: </strong>Most biomedical corpora have not been used outside of the lab that created them, despite the fact that the availability of the gold-standard evaluation data that they provide is one of the rate-limiting factors for the progress of biomedical text mining. Data suggest that one major factor affecting the use of a corpus outside of its home laboratory is the format in which it is distributed. This paper tests the hypothesis that corpus refactoring - changing the format of a corpus without altering its semantics - is a feasible goal, namely that it can be accomplished with a semi-automatable process and in a time-effcient way. We used simple text processing methods and limited human validation to convert the Protein Design Group corpus into two new formats: WordFreak and embedded XML. We tracked the total time expended and the success rates of the automated steps.</p><p><strong>Results: </strong>The refactored corpus is available for download at the BioNLP SourceForge website http://bionlp.sourceforge.net. The total time expended was just over three person-weeks, consisting of about 102 hours of programming time (much of which is one-time development cost) and 20 hours of manual validation of automatic outputs. Additionally, the steps required to refactor any corpus are presented.</p><p><strong>Conclusion: </strong>We conclude that refactoring of publicly available corpora is a technically and economically feasible method for increasing the usage of data already available for evaluating biomedical language processing systems.</p>","PeriodicalId":87404,"journal":{"name":"Journal of biomedical discovery and collaboration","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2007-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1747-5333-2-4","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40962167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nanotechnology research has lately been of intense interest because of its perceived potential for many diverse fields of science. Nanotechnology's tools have found application in diverse fields, from biology to device physics. By the 1990s, there was a concerted effort in the United States to develop a national initiative to promote such research. The success of this effort led to a significant influx of resources and interest in nanotechnology and nanobiotechnology and to the establishment of centralized research programs and facilities. Further government initiatives (at federal, state, and local levels) have firmly cemented these disciplines as 'big science,' with efforts increasingly concentrated at select laboratories and centers. In many respects, these trends mirror certain changes in academic science over the past twenty years, with a greater emphasis on applied science and research that can be more directly utilized for commercial applications. We also compare the National Nanotechnology Initiative and its successors to the Human Genome Project, another large-scale, government-funded initiative. These precedents made shifts in nanotechnology easier for researchers to accept, as they followed trends already established within most fields of science. Finally, these trends are examined in the design of technologies for detection and treatment of cancer, through the Alliance for Nanotechnology in Cancer initiative of the National Cancer Institute. Federal funding of these nanotechnology initiatives has enabled expansion into diverse fields and provided the impetus for broadening the scope of research in several fields, especially biomedicine, though the ultimate utility and impact of all these efforts remain to be seen.
{"title":"Nano-Bio-Genesis: tracing the rise of nanotechnology and nanobiotechnology as 'big science'.","authors":"Rajan P Kulkarni","doi":"10.1186/1747-5333-2-3","DOIUrl":"https://doi.org/10.1186/1747-5333-2-3","url":null,"abstract":"<p><p> Nanotechnology research has lately been of intense interest because of its perceived potential for many diverse fields of science. Nanotechnology's tools have found application in diverse fields, from biology to device physics. By the 1990s, there was a concerted effort in the United States to develop a national initiative to promote such research. The success of this effort led to a significant influx of resources and interest in nanotechnology and nanobiotechnology and to the establishment of centralized research programs and facilities. Further government initiatives (at federal, state, and local levels) have firmly cemented these disciplines as 'big science,' with efforts increasingly concentrated at select laboratories and centers. In many respects, these trends mirror certain changes in academic science over the past twenty years, with a greater emphasis on applied science and research that can be more directly utilized for commercial applications.We also compare the National Nanotechnology Initiative and its successors to the Human Genome Project, another large-scale, government funded initiative. These precedents made acceptance of shifts in nanotechnology easier for researchers to accept, as they followed trends already established within most fields of science. Finally, these trends are examined in the design of technologies for detection and treatment of cancer, through the Alliance for Nanotechnology in Cancer initiative of the National Cancer Institute. Federal funding of these nanotechnology initiatives has allowed for expansion into diverse fields and the impetus for expanding the scope of research of several fields, especially biomedicine, though the ultimate utility and impact of all these efforts remains to be seen.</p>","PeriodicalId":87404,"journal":{"name":"Journal of biomedical discovery and collaboration","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2007-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1747-5333-2-3","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"26830006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kristina M Hettne, Marissa de Mos, Anke G J de Bruijn, Marc Weeber, Scott Boyer, Erik M van Mulligen, Montserrat Cases, Jordi Mestres, Johan van der Lei
Background: Collaborative efforts of physicians and basic scientists are often necessary in the investigation of complex disorders. Difficulties can arise, however, when large amounts of information need to be reviewed. Advanced information retrieval can be beneficial in combining and reviewing data obtained from the various scientific fields. In this paper, a team of investigators with varying backgrounds has applied advanced information retrieval methods, in the form of text mining and entity relationship tools, to review the current literature, with the intention of generating new insights into the molecular mechanisms underlying a complex disorder. As an example of such a disorder, Complex Regional Pain Syndrome (CRPS) was chosen. CRPS is a painful and debilitating syndrome with a complex etiology that to a considerable extent remains to be unraveled, resulting in suboptimal diagnosis and treatment.
Results: A text mining based approach combined with a simple network analysis identified Nuclear Factor kappa B (NFkappaB) as a possible central mediator in both the initiation and progression of CRPS.
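The following sketch illustrates the general idea of such a "simple network analysis": build a co-occurrence network from the entities text-mined per abstract, then rank nodes by degree so that a hub stands out as a candidate central mediator. The gene symbols and entity sets are invented for illustration and are not the study's actual mining output.

```python
from collections import Counter
from itertools import combinations

# Hypothetical entity sets extracted from three abstracts.
abstracts_entities = [
    {"NFKB1", "TNF", "IL6"},
    {"NFKB1", "TNF"},
    {"NFKB1", "SP1"},
]

# Count each co-occurring pair as an (undirected) edge.
edges = Counter()
for entities in abstracts_entities:
    for a, b in combinations(sorted(entities), 2):
        edges[(a, b)] += 1

# Degree = number of distinct neighbors in the network.
degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

print(degree.most_common())  # NFKB1 has the highest degree in this toy network
```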
Conclusion: The result shows the added value of a multidisciplinary approach combined with information retrieval in hypothesis discovery in biomedical research. The new hypothesis, which was derived in silico, provides a framework for further mechanistic studies into the underlying molecular mechanisms of CRPS and requires evaluation in clinical and epidemiological studies.
{"title":"Applied information retrieval and multidisciplinary research: new mechanistic hypotheses in complex regional pain syndrome.","authors":"Kristina M Hettne, Marissa de Mos, Anke G J de Bruijn, Marc Weeber, Scott Boyer, Erik M van Mulligen, Montserrat Cases, Jordi Mestres, Johan van der Lei","doi":"10.1186/1747-5333-2-2","DOIUrl":"https://doi.org/10.1186/1747-5333-2-2","url":null,"abstract":"<p><strong>Background: </strong>Collaborative efforts of physicians and basic scientists are often necessary in the investigation of complex disorders. Difficulties can arise, however, when large amounts of information need to reviewed. Advanced information retrieval can be beneficial in combining and reviewing data obtained from the various scientific fields. In this paper, a team of investigators with varying backgrounds has applied advanced information retrieval methods, in the form of text mining and entity relationship tools, to review the current literature, with the intention to generate new insights into the molecular mechanisms underlying a complex disorder. As an example of such a disorder the Complex Regional Pain Syndrome (CRPS) was chosen. CRPS is a painful and debilitating syndrome with a complex etiology that is still unraveled for a considerable part, resulting in suboptimal diagnosis and treatment.</p><p><strong>Results: </strong>A text mining based approach combined with a simple network analysis identified Nuclear Factor kappa B (NFkappaB) as a possible central mediator in both the initiation and progression of CRPS.</p><p><strong>Conclusion: </strong>The result shows the added value of a multidisciplinary approach combined with information retrieval in hypothesis discovery in biomedical research. The new hypothesis, which was derived in silico, provides a framework for further mechanistic studies into the underlying molecular mechanisms of CRPS and requires evaluation in clinical and epidemiological studies.</p>","PeriodicalId":87404,"journal":{"name":"Journal of biomedical discovery and collaboration","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2007-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1747-5333-2-2","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"26705394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data management and integration are complicated and ongoing problems that will require commitment of resources and expertise from the various biological science communities. Primary components of successful cross-scale integration are smooth information management and migration from one context to another. We call for a broadening of the definition of bioinformatics and bioinformatics training to span biological disciplines and biological scales. Training programs are needed that educate a new kind of informatics professional, Biological Information Specialists, to work in collaboration with various discipline-specific research personnel. Biological Information Specialists are an extension of the informationist movement that began within library and information science (LIS) over 30 years ago as a professional position to fill a gap in clinical medicine. These professionals will help advance science by improving access to scientific information and by freeing scientists who are not interested in data management to concentrate on their science.
{"title":"Biological information specialists for biological informatics.","authors":"P Bryan Heidorn, Carole L Palmer, Dan Wright","doi":"10.1186/1747-5333-2-1","DOIUrl":"https://doi.org/10.1186/1747-5333-2-1","url":null,"abstract":"<p><p>Data management and integration are complicated and ongoing problems that will require commitment of resources and expertise from the various biological science communities. Primary components of successful cross-scale integration are smooth information management and migration from one context to another. We call for a broadening of the definition of bioinformatics and bioinformatics training to span biological disciplines and biological scales. Training programs are needed that educate a new kind of informatics professional, Biological Information Specialists, to work in collaboration with various discipline-specific research personnel. Biological Information Specialists are an extension of the informationist movement that began within library and information science (LIS) over 30 years ago as a professional position to fill a gap in clinical medicine. These professionals will help advance science by improving access to scientific information and by freeing scientists who are not interested in data management to concentrate on their science.</p>","PeriodicalId":87404,"journal":{"name":"Journal of biomedical discovery and collaboration","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2007-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1747-5333-2-1","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"26549396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Francisco M Couto, Mário J Silva, Vivian Lee, Emily Dimmer, Evelyn Camon, Rolf Apweiler, Harald Kirsch, Dietrich Rebholz-Schuhmann
Background: Annotation of proteins with gene ontology (GO) terms is an ongoing and complex task. Manual GO annotation is precise and precious, but it is time-consuming. As a result, most proteins carry automatically generated, uncurated annotations rather than curated ones. Text-mining systems that use literature for automatic annotation have been proposed, but they do not satisfy the high quality expectations of curators.
Results: In this paper we describe an approach that links uncurated annotations to text extracted from literature. The selection of the text is based on the similarity of the text to the term from the uncurated annotation. Besides substantiating the uncurated annotations, the extracted texts also lead to novel annotations. In addition, the approach uses the GO hierarchy to achieve high precision. Our approach is integrated into GOAnnotator, a tool that assists the curation process for GO annotation of UniProt proteins.
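A hedged sketch of the two ingredients just described: score candidate sentences against the GO term name of an uncurated annotation, and use the GO hierarchy to check that an extracted term is related to the annotated one. A naive Jaccard word overlap and a toy hierarchy fragment stand in for GOAnnotator's actual similarity measure and GO access; only the GO IDs and their is-a links below are real.

```python
# Toy fragment of the GO is-a hierarchy; a real tool would query GO itself.
go_parents = {
    "GO:0006355": "GO:0010468",  # regulation of transcription -> regulation of gene expression
    "GO:0010468": "GO:0008150",  # -> biological_process
}

def ancestors(term):
    """Walk up the toy is-a chain, collecting all ancestors of a GO term."""
    found = set()
    while term in go_parents:
        term = go_parents[term]
        found.add(term)
    return found

def jaccard(a, b):
    """Naive word-overlap similarity (no tokenization cleanup) between texts."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def best_evidence(term_name, sentences):
    """Pick the sentence most lexically similar to the GO term name."""
    return max(sentences, key=lambda s: jaccard(term_name, s))

sentences = [
    "The protein regulates transcription of target genes.",
    "Crystals were grown at 4 degrees.",
]
print(best_evidence("regulation of transcription", sentences))

# Hierarchy check: accept an extracted term only if it is related (here, an
# ancestor) to the term from the uncurated annotation.
print("GO:0010468" in (ancestors("GO:0006355") | {"GO:0006355"}))  # True
```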
Conclusion: The GO curators assessed GOAnnotator with a set of 66 distinct UniProt/SwissProt proteins with uncurated annotations. GOAnnotator provided correct evidence text at 93% precision. This high precision results from using the GO hierarchy to select only GO terms similar to those from the uncurated annotations in GOA. Our approach is the first to achieve high precision, which is crucial for the efficient support of GO curators. GOAnnotator was implemented as a web tool that is freely available at http://xldb.di.fc.ul.pt/rebil/tools/goa/.
{"title":"GOAnnotator: linking protein GO annotations to evidence text.","authors":"Francisco M Couto, Mário J Silva, Vivian Lee, Emily Dimmer, Evelyn Camon, Rolf Apweiler, Harald Kirsch, Dietrich Rebholz-Schuhmann","doi":"10.1186/1747-5333-1-19","DOIUrl":"https://doi.org/10.1186/1747-5333-1-19","url":null,"abstract":"<p><strong>Background: </strong>Annotation of proteins with gene ontology (GO) terms is ongoing work and a complex task. Manual GO annotation is precise and precious, but it is time-consuming. Therefore, instead of curated annotations most of the proteins come with uncurated annotations, which have been generated automatically. Text-mining systems that use literature for automatic annotation have been proposed but they do not satisfy the high quality expectations of curators.</p><p><strong>Results: </strong>In this paper we describe an approach that links uncurated annotations to text extracted from literature. The selection of the text is based on the similarity of the text to the term from the uncurated annotation. Besides substantiating the uncurated annotations, the extracted texts also lead to novel annotations. In addition, the approach uses the GO hierarchy to achieve high precision. Our approach is integrated into GOAnnotator, a tool that assists the curation process for GO annotation of UniProt proteins.</p><p><strong>Conclusion: </strong>The GO curators assessed GOAnnotator with a set of 66 distinct UniProt/SwissProt proteins with uncurated annotations. GOAnnotator provided correct evidence text at 93% precision. This high precision results from using the GO hierarchy to only select GO terms similar to GO terms from uncurated annotations in GOA. Our approach is the first one to achieve high precision, which is crucial for the efficient support of GO curators. GOAnnotator was implemented as a web tool that is freely available at http://xldb.di.fc.ul.pt/rebil/tools/goa/.</p>","PeriodicalId":87404,"journal":{"name":"Journal of biomedical discovery and collaboration","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2006-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1747-5333-1-19","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"26454294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background: The annual James Arthur lecture series on the Evolution of the Human Brain was inaugurated at the American Museum of Natural History in 1932, through a bequest from a successful manufacturer with a particular interest in mechanisms. Karl Pribram's thirty-ninth lecture of the series, delivered in 1970, was a seminal event that heralded much of the research agenda touching on the evolution of human uniqueness that has since been pursued by representatives of diverse disciplines.
Discussion: In his James Arthur lecture Pribram raised questions about the coding of information in the brain and about the complex association between language, symbol, and the unique human cognitive system. These questions are as pertinent today as in 1970. The emergence of modern human symbolic cognition is often viewed as a gradual, incremental process, governed by inexorable natural selection and propelled by the apparent advantages of increasing intelligence. However, numerous theoretical considerations render such a scenario implausible, and an examination of the pattern of acquisition of behavioral and anatomical novelties in human evolution indicates that, throughout, major change was both sporadic and rare. What is more, modern bony anatomy and brain size were apparently both achieved well before we have any evidence for symbolic behavior patterns. This suggests that the biological substrate underlying the symbolic thought so distinctive of Homo sapiens today was acquired exaptively, long before its potential was actually put to use. In that case, we need to look for the agent, perforce a cultural one, that stimulated the adoption of symbolic thought patterns. That stimulus may well have been the spontaneous invention of articulate language.
{"title":"Karl Pribram, The James Arthur lectures, and what makes us human.","authors":"Ian Tattersall","doi":"10.1186/1747-5333-1-15","DOIUrl":"https://doi.org/10.1186/1747-5333-1-15","url":null,"abstract":"<p><strong>Background: </strong>The annual James Arthur lecture series on the Evolution of the Human Brain was inaugurated at the American Museum of Natural History in 1932, through a bequest from a successful manufacturer with a particular interest in mechanisms. Karl Pribram's thirty-ninth lecture of the series, delivered in 1970, was a seminal event that heralded much of the research agenda, since pursued by representatives of diverse disciplines, that touches on the evolution of human uniqueness.</p><p><strong>Discussion: </strong>In his James Arthur lecture Pribram raised questions about the coding of information in the brain and about the complex association between language, symbol, and the unique human cognitive system. These questions are as pertinent today as in 1970. The emergence of modern human symbolic cognition is often viewed as a gradual, incremental process, governed by inexorable natural selection and propelled by the apparent advantages of increasing intelligence. However, there are numerous theoretical considerations that render such a scenario implausible, and an examination of the pattern of acquisition of behavioral and anatomical novelties in human evolution indicates that, throughout, major change was both sporadic and rare. What is more, modern bony anatomy and brain size were apparently both achieved well before we have any evidence for symbolic behavior patterns. This suggests that the biological substrate underlying the symbolic thought that is so distinctive of Homo sapiens today was exaptively achieved, long before its potential was actually put to use. In which case we need to look for the agent, perforce a cultural one, that stimulated the adoption of symbolic thought patterns. That stimulus may well have been the spontaneous invention of articulate language.</p>","PeriodicalId":87404,"journal":{"name":"Journal of biomedical discovery and collaboration","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2006-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1747-5333-1-15","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"26413299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In 1970, Karl Pribram took on the immense challenge of asking the question, what makes us human? Nearly four decades later, the most significant finding has been the undeniable realization of how incredibly subtle and fine-scaled the unique biological features of our species must be. The recent explosion in the availability of large-scale sequence data, however, and the consequent emergence of comparative genomics, are rapidly transforming the study of human evolution. The field of comparative genomics is allowing us to reach unparalleled resolution, reframing our questions in reference to DNA sequence, the very unit on which evolution operates. But like any reductionist approach, it comes at a price. Comparative genomics may provide the necessary resolution for identifying rare DNA sequence differences in a vast sea of conservation, but ultimately we will have to face the challenge of figuring out how DNA sequence divergence translates into phenotypic divergence. Our goal here is to provide a brief outline of the major findings made in the study of human brain evolution since the Pribram lecture, focusing specifically on the field of comparative genomics. We then discuss the broader implications of these findings and the future challenges that are in store.
{"title":"What makes us human: revisiting an age-old question in the genomic era.","authors":"Nitzan Mekel-Bobrov, Bruce T Lahn","doi":"10.1186/1747-5333-1-18","DOIUrl":"https://doi.org/10.1186/1747-5333-1-18","url":null,"abstract":"<p><p>In 1970, Karl Pribram took on the immense challenge of asking the question, what makes us human? Nearly four decades later, the most significant finding has been the undeniable realization of how incredibly subtle and fine-scaled the unique biological features of our species must be. The recent explosion in the availability of large-scale sequence data, however, and the consequent emergence of comparative genomics, are rapidly transforming the study of human evolution. The field of comparative genomics is allowing us to reach unparalleled resolution, reframing our questions in reference to DNA sequence--the very unit that evolution operates on. But like any reductionist approach, it comes at a price. Comparative genomics may provide the necessary resolution for identifying rare DNA sequence differences in a vast sea of conservation, but ultimately we will have to face the challenge of figuring out how DNA sequence divergence translates into phenotypic divergence. Our goal here is to provide a brief outline of the major findings made in the study of human brain evolution since the Pribram lecture, focusing specifically on the field of comparative genomics. We then discuss the broader implications of these findings and the future challenges that are in store.</p>","PeriodicalId":87404,"journal":{"name":"Journal of biomedical discovery and collaboration","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2006-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1747-5333-1-18","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"26413216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}