Result identification for biomedical abstracts using Conditional Random Fields

2008 IEEE International Conference on Information Reuse and Integration | Pub Date: 2008-07-13 | DOI: 10.1109/IRI.2008.4583016

Ryan T. K. Lin, Hong-Jie Dai, Yue-Yang Bow, Min-Yuh Day, Richard Tzong-Han Tsai, W. Hsu


Citations: 6

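Abstract

For biomedical research, the most important parts of an abstract are the result and conclusion sections. Some journals divide abstracts into labeled sections so that readers can easily identify those parts, but others do not. We propose a method that automatically identifies the result and conclusion sections of any biomedical abstract by formulating the identification problem as a sequence labeling task. Three feature sets (Position, Named Entity, and Word Frequency) are employed with Conditional Random Fields (CRFs) as the underlying machine learning model. Experimental results show that the combination of our proposed feature sets achieves F-measure, precision, and recall scores of 92.50%, 95.32%, and 89.85%, respectively.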
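As a rough illustration of the sequence labeling formulation, the sketch below turns each sentence of an abstract into a feature dictionary, which is the kind of per-item input a CRF toolkit typically consumes. It is not the authors' implementation. The function names, the specific features, and the cue-word list are all assumptions made for illustration. Only Position and Word Frequency style features are shown; Named Entity features are omitted because they would require an external tagger. The last function is the standard F-measure, the harmonic mean of precision and recall, which can be used to check the reported scores against each other.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def sentence_features(sentences, index, cue_words):
    """Build a hypothetical feature dict for one sentence of an abstract.

    `cue_words` is an assumed, hand-picked vocabulary; the paper's actual
    Word Frequency features are not specified in the abstract.
    """
    counts = Counter(tokenize(sentences[index]))
    last = len(sentences) - 1
    return {
        # Position features: where the sentence sits within the abstract.
        "relative_position": index / max(last, 1),
        "is_first": index == 0,
        "is_last": index == last,
        # Word-frequency feature: occurrences of cue words in the sentence.
        "cue_word_count": sum(counts[w] for w in cue_words),
    }


def abstract_to_sequence(sentences, cue_words):
    """Map an abstract (list of sentences) to a sequence of feature dicts.

    A CRF would label this whole sequence jointly, so the label of one
    sentence can depend on the labels of its neighbours.
    """
    return [sentence_features(sentences, i, cue_words)
            for i in range(len(sentences))]


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)
```

With the reported precision and recall, `f_measure(0.9532, 0.8985)` evaluates to about 0.9250, which agrees with the reported F-measure of 92.50%.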


