Result identification for biomedical abstracts using Conditional Random Fields

2008 IEEE International Conference on Information Reuse and Integration. Pub Date: 2008-07-13. DOI: 10.1109/IRI.2008.4583016

Ryan T. K. Lin, Hong-Jie Dai, Yue-Yang Bow, Min-Yuh Day, Richard Tzong-Han Tsai, W. Hsu

Citations: 6

Abstract

For biomedical research, the most important parts of an abstract are the result and conclusion sections. Some journals divide an abstract into several sections so that readers can easily identify those parts, but others do not. We propose a method that automatically identifies the result and conclusion sections of any biomedical abstract by formulating the identification problem as a sequence labeling task. Three feature sets (Position, Named Entity, and Word Frequency) are employed, with Conditional Random Fields (CRFs) as the underlying machine learning model. Experimental results show that the combination of our proposed feature sets achieves F-measure, precision, and recall scores of 92.50%, 95.32%, and 89.85%, respectively.
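The sequence labeling formulation can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the labeling unit, the label set, or how each feature set is computed. The sketch assumes sentences as the unit, and the feature names, the `entity_terms` lookup, and the helper names `tokenize`, `sentence_features`, and `f_measure` are all hypothetical. It builds one feature dictionary per sentence, loosely following the three feature sets, and checks that the reported F-measure is the harmonic mean of the reported precision and recall.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def sentence_features(sentences, entity_terms=frozenset()):
    """Return one feature dict per sentence of a single abstract.

    Assumption: a CRF toolkit would take this list as one sequence and
    predict one label per sentence (for example, result vs. other).
    """
    tokenized = [tokenize(s) for s in sentences]
    # Word counts over the whole abstract (hypothetical frequency feature).
    counts = Counter(tok for toks in tokenized for tok in toks)
    n = len(sentences)
    features = []
    for i, toks in enumerate(tokenized):
        features.append({
            # Position: where the sentence sits in the abstract.
            "rel_position": i / (n - 1) if n > 1 else 0.0,
            "is_first": i == 0,
            "is_last": i == n - 1,
            # Named Entity: tokens found in an assumed entity lexicon.
            "entity_count": sum(tok in entity_terms for tok in toks),
            # Word Frequency: mean in-abstract frequency of the tokens.
            "mean_word_freq": (
                sum(counts[t] for t in toks) / len(toks) if toks else 0.0
            ),
        })
    return features


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Consistency check on the scores reported in the abstract.
print(round(f_measure(95.32, 89.85), 2))  # 92.5
```

In a real system, the feature dictionaries would be passed, together with gold labels, to a CRF library for training. The lexicon lookup stands in for a proper named entity recognizer.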


