临床实体识别的交互式在线学习

HILDA '16 Pub Date : 2016-06-26 DOI:10.1145/2939502.2939510
L. Tari, Varish Mulwad, Anna von Reden
{"title":"临床实体识别的交互式在线学习","authors":"L. Tari, Varish Mulwad, Anna von Reden","doi":"10.1145/2939502.2939510","DOIUrl":null,"url":null,"abstract":"Named entity recognition and entity linking are core natural language processing components that are predominantly solved by supervised machine learning approaches. Such supervised machine learning approaches require manual annotation of training data that can be expensive to compile. The applicability of supervised, machine learning-based entity recognition and linking components in real-world applications can be hindered by the limited availability of training data. In this paper, we propose a novel approach that uses ontologies as a basis for entity recognition and linking, and captures context of neighboring tokens of the entities of interest with vectors based on syntactic and semantic features. Our approach takes user feedback so that the vector-based model can be continuously updated in an online setting. Here we demonstrate our approach in a healthcare context, using it to recognize body part and imaging modality entities within clinical documents, and map these entities to the right concepts in the RadLex and NCIT medical ontologies. Our current evaluation shows promising results on a small set of clinical documents with a precision and recall of 0.841 and 0.966. The evaluation also demonstrates that our approach is capable of continuous performance improvement with increasing size of examples. We believe that our human-in-the-loop, online learning approach to entity recognition and linking shows promise that it is suitable for real-world applications.","PeriodicalId":356971,"journal":{"name":"HILDA '16","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Interactive online learning for clinical entity recognition\",\"authors\":\"L. Tari, Varish Mulwad, Anna von Reden\",\"doi\":\"10.1145/2939502.2939510\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Named entity recognition and entity linking are core natural language processing components that are predominantly solved by supervised machine learning approaches. Such supervised machine learning approaches require manual annotation of training data that can be expensive to compile. The applicability of supervised, machine learning-based entity recognition and linking components in real-world applications can be hindered by the limited availability of training data. In this paper, we propose a novel approach that uses ontologies as a basis for entity recognition and linking, and captures context of neighboring tokens of the entities of interest with vectors based on syntactic and semantic features. Our approach takes user feedback so that the vector-based model can be continuously updated in an online setting. Here we demonstrate our approach in a healthcare context, using it to recognize body part and imaging modality entities within clinical documents, and map these entities to the right concepts in the RadLex and NCIT medical ontologies. Our current evaluation shows promising results on a small set of clinical documents with a precision and recall of 0.841 and 0.966. The evaluation also demonstrates that our approach is capable of continuous performance improvement with increasing size of examples. We believe that our human-in-the-loop, online learning approach to entity recognition and linking shows promise that it is suitable for real-world applications.\",\"PeriodicalId\":356971,\"journal\":{\"name\":\"HILDA '16\",\"volume\":\"8 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-06-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"HILDA '16\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2939502.2939510\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"HILDA '16","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2939502.2939510","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

命名实体识别和实体链接是自然语言处理的核心组件,主要由监督机器学习方法解决。这种有监督的机器学习方法需要对训练数据进行手动注释,编译成本可能很高。有监督的、基于机器学习的实体识别和链接组件在现实应用中的适用性可能会受到训练数据有限可用性的阻碍。在本文中,我们提出了一种新的方法,该方法使用本体作为实体识别和链接的基础,并基于语法和语义特征的向量捕获感兴趣实体的相邻标记的上下文。我们的方法采用用户反馈,因此基于向量的模型可以在在线设置中不断更新。在这里,我们将在医疗保健上下文中演示我们的方法,使用它来识别临床文档中的身体部位和成像模式实体,并将这些实体映射到RadLex和NCIT医学本体中的正确概念。我们目前的评估在一小部分临床文献上显示出有希望的结果,准确率和召回率分别为0.841和0.966。评估还表明,我们的方法能够随着样本规模的增加而持续提高性能。我们相信,我们的人在循环,在线学习方法的实体识别和链接显示出它适用于现实世界的应用前景。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Interactive online learning for clinical entity recognition
Named entity recognition and entity linking are core natural language processing components that are predominantly solved by supervised machine learning approaches. Such supervised machine learning approaches require manual annotation of training data that can be expensive to compile. The applicability of supervised, machine learning-based entity recognition and linking components in real-world applications can be hindered by the limited availability of training data. In this paper, we propose a novel approach that uses ontologies as a basis for entity recognition and linking, and captures context of neighboring tokens of the entities of interest with vectors based on syntactic and semantic features. Our approach takes user feedback so that the vector-based model can be continuously updated in an online setting. Here we demonstrate our approach in a healthcare context, using it to recognize body part and imaging modality entities within clinical documents, and map these entities to the right concepts in the RadLex and NCIT medical ontologies. Our current evaluation shows promising results on a small set of clinical documents with a precision and recall of 0.841 and 0.966. The evaluation also demonstrates that our approach is capable of continuous performance improvement with increasing size of examples. We believe that our human-in-the-loop, online learning approach to entity recognition and linking shows promise that it is suitable for real-world applications.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
VisTrees: fast indexes for interactive data exploration PFunk-H: approximate query processing using perceptual models Towards reliable interactive data cleaning: a user survey and recommendations ModelDB: a system for machine learning model management TrendQuery: a system for interactive exploration of trends
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1