Biomedical literature mining: challenges and solutions in the 'omics' era.

Damien Chaussabel
Journal: American journal of pharmacogenomics : genomics-related research in drug development and clinical practice
Publication date: 2004-01-01
Publication type: Journal Article
DOI: 10.2165/00129785-200404060-00005 (https://doi.org/10.2165/00129785-200404060-00005)
Citations: 33

Abstract

It is now obvious that the rate-limiting step in high-throughput experimentation is neither data acquisition nor analysis, but rather our ability to interpret data on a genome-wide scale. Indeed, the explosion of data sampling capacity combined with increasing publication rates greatly impairs our ability to find meaning in vast collections of data. In order to support data interpretation, bioinformatic tools are needed to identify critical information contained in large bodies of literature. However, extracting knowledge embedded in free text is an arduous task, compounded in the biomedical field by an inconsistent gene nomenclature, domain-specific language and restricted access to full-text articles. This paper presents a selection of currently available biomedical literature mining software. These tools rely on statistical and, more recently, semantic analyses (Natural Language Processing) to automatically extract information from the literature. In addition, a literature mining strategy has been developed to explore patterns of term occurrences in abstracts. This method automatically identifies relevant keywords in collections of abstracts, and uses a pattern discovery algorithm to generate a visual interface for exploring functional associations among genes. Term occurrence heatmaps can also be combined with gene expression profiles to provide valuable functional annotations. Furthermore, as demonstrated with tumor cell line literature profiling results, this approach can be applied to a variety of themes beyond genomic data analysis. Altogether, these examples illustrate how literature analysis can be employed to support knowledge discovery in biomedical research.
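
The abstract does not spell out how term occurrences are computed, so the following is only a minimal sketch of the general idea, not the author's implementation. It assumes each gene comes with a small collection of abstracts, measures the fraction of a gene's abstracts that contain each term, and keeps terms that are frequent for at least one gene and rare for at least one other. The function names, the cutoffs `high` and `low`, the gene labels and the toy abstracts are all hypothetical. The resulting matrix is the kind of table a heatmap could be drawn from; the pattern discovery (clustering) step is not shown.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a text and return its set of distinct tokens (2+ characters)."""
    return set(re.findall(r"[a-z][a-z0-9-]+", text.lower()))


def term_occurrence(abstracts_by_gene):
    """Map gene -> {term: fraction of that gene's abstracts containing the term}."""
    profile = {}
    for gene, abstracts in abstracts_by_gene.items():
        counts = defaultdict(int)
        for abstract in abstracts:
            for term in tokenize(abstract):
                counts[term] += 1
        profile[gene] = {t: c / len(abstracts) for t, c in counts.items()}
    return profile


def select_keywords(profile, high=0.75, low=0.25):
    """Keep terms frequent for at least one gene and rare for at least one other.

    The cutoffs are illustrative assumptions, not values from the paper.
    """
    terms = set().union(*(p.keys() for p in profile.values()))
    keep = []
    for term in sorted(terms):
        values = [p.get(term, 0.0) for p in profile.values()]
        if max(values) >= high and min(values) <= low:
            keep.append(term)
    return keep


# Toy input: invented abstracts for two placeholder genes.
abstracts_by_gene = {
    "GENE_A": [
        "Interferon signaling induces antiviral gene expression.",
        "Antiviral interferon pathway expression in macrophages.",
    ],
    "GENE_B": [
        "Apoptosis of tumor cells requires caspase expression.",
        "Caspase cleavage during apoptosis alters expression.",
    ],
}

profile = term_occurrence(abstracts_by_gene)
keywords = select_keywords(profile)

# Rows are genes, columns are the selected keywords.
matrix = [[profile[g].get(t, 0.0) for t in keywords] for g in abstracts_by_gene]

print(keywords)
for gene, row in zip(abstracts_by_gene, matrix):
    print(gene, row)
```

In this toy input, "expression" appears in every abstract of both genes, so it fails the `low` cutoff and is discarded as uninformative, while terms specific to one gene are retained. Real use would also need gene name normalization and stop-word handling, which the sketch leaves out.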
