Enlarging drug dictionary with semi-supervised learning for Drug Entity Recognition
Donghuo Zeng, Chengjie Sun, Lei Lin, Bingquan Liu
2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), published 2016-12-01
DOI: https://doi.org/10.1109/BIBM.2016.7822818
Citations: 4
Abstract
Drug Entity Recognition (DER) is a crucial task for information extraction from biomedical text. Much previous work on DER uses known drugs to build features; however, known drug resources are limited. In this paper, we propose a semi-supervised learning method to extend an existing drug dictionary. The extended dictionary enriches the features available for DER. Using a Conditional Random Fields (CRF) model with the enriched features, we achieve an F-measure of 89.26% on the DDIExtraction 2013 challenge data set, which outperforms the best system of the DDIExtraction 2013 challenge.
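
As a rough illustration of how a drug dictionary can supply features to a CRF tagger, the sketch below marks tokens covered by a dictionary entry and attaches that mark to each token's feature set. It is not the paper's feature set, which the abstract does not specify. The function names `dictionary_tags` and `token_features`, the greedy longest-match strategy, the `B-DICT`/`I-DICT`/`O` tag scheme, the surface features, and the example drug names are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def dictionary_tags(
    tokens: List[str],
    drug_dict: Set[Tuple[str, ...]],
    max_len: int = 3,
) -> List[str]:
    """Tag tokens covered by a dictionary entry (greedy longest match).

    Dictionary entries are lowercase token tuples. The returned list has
    one tag per token: "B-DICT", "I-DICT", or "O".
    """
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(lowered[i:i + n]) in drug_dict:
                matched = n
                break
        if matched:
            tags[i] = "B-DICT"
            for j in range(i + 1, i + matched):
                tags[j] = "I-DICT"
            i += matched
        else:
            i += 1
    return tags


def token_features(
    tokens: List[str],
    drug_dict: Set[Tuple[str, ...]],
) -> List[Dict[str, object]]:
    """Build one feature dict per token, including the dictionary tag."""
    dict_tags = dictionary_tags(tokens, drug_dict)
    return [
        {
            "lower": tok.lower(),
            "is_capitalized": tok[:1].isupper(),
            "suffix3": tok[-3:].lower(),
            "dict_tag": tag,  # the dictionary-derived feature
        }
        for tok, tag in zip(tokens, dict_tags)
    ]


# Hypothetical dictionary and sentence, for illustration only.
example_dict = {("aspirin",), ("warfarin", "sodium")}
example_tokens = ["Aspirin", "may", "interact", "with", "warfarin", "sodium", "."]
example_features = token_features(example_tokens, example_dict)
```

Under these assumptions, a larger dictionary changes only the contents of `drug_dict`: more tokens receive a `B-DICT` or `I-DICT` value, which is one way an extended dictionary could enrich the features seen by a sequence model. Feature dicts of this shape are the kind of per-token input accepted by common CRF toolkits.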